71 research outputs found

    DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models

    Self-supervised learning (SSL) has achieved notable success in many speech processing tasks, but the large model size and heavy computational cost hinder deployment. Knowledge distillation trains a small student model to mimic the behavior of a large teacher model. However, the student architecture usually needs to be manually designed and remains fixed during training, which requires prior knowledge and can lead to suboptimal performance. Inspired by the recent success of task-specific structured pruning, we propose DPHuBERT, a novel task-agnostic compression method for speech SSL based on joint distillation and pruning. Experiments on SUPERB show that DPHuBERT outperforms pure distillation methods in almost all tasks. Moreover, DPHuBERT requires little training time and performs well with limited training data, making it suitable for resource-constrained applications. Our method can also be applied to various speech SSL models. Our code and models will be publicly available. Comment: Accepted at INTERSPEECH 2023. Code will be available at: https://github.com/pyf98/DPHuBER

    4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders

    The network architecture of end-to-end (E2E) automatic speech recognition (ASR) can be classified into several models, including connectionist temporal classification (CTC), recurrent neural network transducer (RNN-T), attention mechanism, and non-autoregressive mask-predict models. Since each of these network architectures has pros and cons, a typical use case is to switch between these separate models depending on the application requirement, which increases the overhead of maintaining all models. Several methods for integrating two of these complementary models to mitigate the overhead issue have been proposed; however, if we integrate more models, we will further benefit from these complementary models and realize broader applications with a single system. This paper proposes four-decoder joint modeling (4D) of CTC, attention, RNN-T, and mask-predict, which has the following three advantages: 1) The four decoders are jointly trained so that they can be easily switched depending on the application scenario. 2) Joint training may bring model regularization and improve the model robustness thanks to their complementary properties. 3) Novel one-pass joint decoding methods using CTC, attention, and RNN-T further improve the performance. The experimental results showed that the proposed model consistently reduced the WER. Comment: Accepted by INTERSPEECH202

    Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search

    End-to-end (E2E) automatic speech recognition (ASR) methods exhibit remarkable performance. However, since the performance of such methods is intrinsically linked to the context present in the training data, E2E-ASR methods do not perform as desired for unseen user contexts (e.g., technical terms, personal names, and playlists). Thus, E2E-ASR methods must be easily contextualized by the user or developer. This paper proposes an attention-based contextual biasing method that can be customized using an editable phrase list (referred to as a bias list). The proposed method can be trained effectively by combining a bias phrase index loss and special tokens to detect the bias phrases in the input speech data. In addition, to further improve the contextualization performance during inference, we propose a bias phrase boosted (BPB) beam search algorithm based on the bias phrase index probability. Experimental results demonstrate that the proposed method consistently improves the word error rate and the character error rate of the target phrases in the bias list on the Librispeech-960 (English) and our in-house (Japanese) datasets, respectively. Comment: accepted by ICASSP2022

    Cisplatin-induced programmed cell death ligand-2 expression is associated with metastasis ability in oral squamous cell carcinoma.

    Programmed cell death ligands (PD-Ls) are expressed in tumor cells where they bind to programmed cell death-1, an immunocyte co-receptor, resulting in tumor cell evasion from the immune system. Chemotherapeutic drugs have been recently reported to induce the expression of PD-Ls, such as PD-L1, in some cancer cells. However, little is known regarding PD-L2 expression and its role in oral squamous cell carcinoma (OSCC). In this study, we examined the effect of cisplatin on the expression and regulation of PD-L2 in OSCC cell lines and analyzed malignant behavior in PD-L2-expressing cells using colony, transwell and transformation assays. In addition, we examined PD-L2 expression in the tumor tissues of OSCC patients using cytology and tissue microarray methods. In OSCC cell lines, cisplatin treatment upregulated PD-L2 expression, along with that of the drug efflux transporter ABCG2, via signal transducer and activator of transcription (STAT) 1/3 activation. Moreover, PD-L2-positive or PD-L2-overexpressing cells demonstrated upregulation in both invasion and transformation ability but not in proliferation compared with PD-L2-negative or PD-L2-silenced cells. PD-L2 expression was also observed in OSCC cells of cytology samples and tissue from OSCC patients. The intensity of PD-L2 expression was correlated with more malignant morphological features in the histological appearance and an invasive pattern. Our findings indicate that cisplatin upregulates PD-L2 expression in OSCC via STAT1/3 activation and that the expression of PD-L2 is likely to be associated with malignancy in OSCC. The PD-L2 expression in cisplatin-resistant OSCC cells may be a critical factor in the prognosis of advanced OSCC patients. Fukuoka Dental College, 2019

    Midterm results of left coronary artery reimplantation through the transverse sinus of the pericardium in adult Bland-White-Garland syndrome

    The anomalous origin of the left coronary artery from the pulmonary artery - known as Bland-White-Garland syndrome - is a rare congenital malformation that affects 1 in 300,000 live births. Most patients die in infancy without any surgical treatment. Patients who survive past childhood often have varying symptoms such as myocardial ischemia, impaired left ventricular function, mitral regurgitation, and progressive heart failure, depending on the development of collateral circulation. In the present report, we describe a procedure wherein the left coronary artery ostium was translocated through the transverse sinus of the pericardium in a 43-year-old mother with Bland-White-Garland syndrome and concomitant mitral regurgitation and report on the associated midterm results.

    Development of an atmospheric N2O isotopocule model and optimization procedure, and application to source estimation

    This paper presents the development of an atmospheric N2O isotopocule model based on a chemistry-coupled atmospheric general circulation model (ACTM). We also describe a simple method to optimize the model and present its use in estimating the isotopic signatures of surface sources at the hemispheric scale. Data obtained from ground-based observations, measurements of firn air, and balloon and aircraft flights were used to optimize the long-term trends, interhemispheric gradients, and photolytic fractionation, respectively, in the model. This optimization successfully reproduced realistic spatial and temporal variations of atmospheric N2O isotopocules throughout the atmosphere from the surface to the stratosphere. The very small gradients associated with vertical profiles through the troposphere and the latitudinal and vertical distributions within each hemisphere were also reasonably simulated. The results of the isotopic characterization of the global total sources were generally consistent with previous one-box model estimates, indicating that the observed atmospheric trend is the dominant factor controlling the source isotopic signature. However, hemispheric estimates were different from those generated by a previous two-box model study, mainly due to the model accounting for the interhemispheric transport and latitudinal and vertical distributions of tropospheric N2O isotopocules. Comparisons of time series of atmospheric N2O isotopocule ratios between our model and observational data from several laboratories revealed the need for a more systematic and elaborate intercalibration of the standard scales used in N2O isotopic measurements in order to capture a more complete and precise picture of the temporal and spatial variations in atmospheric N2O isotopocule ratios. This study highlights the possibility that inverse estimation of surface N2O fluxes, including the isotopic information as additional constraints, could be realized.

    Label-free multiphoton excitation imaging as a promising diagnostic tool for breast cancer

    Histopathological diagnosis is the ultimate method of attaining the final diagnosis; however, the observation range is limited to the two-dimensional plane, and it requires thin slicing of the tissue, which limits diagnostic information. To seek solutions for these problems, we proposed a novel imaging-based histopathological examination. We used the multiphoton excitation microscopy (MPM) technique to establish a method for visualizing unfixed/unstained human breast tissues. Under near-infrared ray excitation, fresh human breast tissues emitted fluorescent signals with three major peaks, which enabled visualizing the breast tissue morphology without any fixation or dye staining. Our study using human breast tissue samples from 32 patients indicated that experienced pathologists can distinguish normal from cancerous lesions using only these MPM images with a kappa coefficient of 1.0. Moreover, we developed an image classification algorithm with artificial intelligence that enabled us to automatically identify cancer cells in small areas with a high sensitivity of ≥0.942. Taken together, label-free MPM imaging is a promising method for the real-time automatic diagnosis of breast cancer. This is the pre-peer reviewed version of the following article: Matsui T., Iwasa A., Mimura M., et al. Label-free multiphoton excitation imaging as a promising diagnostic tool for breast cancer. Cancer Science 113, 2916 (2022), which has been published in final form at https://doi.org/10.1111/cas.15428. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.

    Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data

    Pre-training speech models on large volumes of data has achieved remarkable success. OpenAI Whisper is a multilingual multitask model trained on 680k hours of supervised speech data. It generalizes well to various speech recognition and translation benchmarks even in a zero-shot setup. However, the full pipeline for developing such models (from data collection to training) is not publicly accessible, which makes it difficult for researchers to further improve its performance and address training-related issues such as efficiency, robustness, fairness, and bias. This work presents an Open Whisper-style Speech Model (OWSM), which reproduces Whisper-style training using an open-source toolkit and publicly available data. OWSM even supports more translation directions and can be more efficient to train. We will publicly release all scripts used for data preparation, training, inference, and scoring as well as pre-trained models and training logs to promote open science. Comment: Accepted at ASRU 202

    Clinicopathological significance of antinuclear antibodies in non-alcoholic steatohepatitis

    Kanazawa University Graduate School of Medical Science, Cancer Cell Biology. Aim: Serum antinuclear antibodies (ANA) are occasionally noted in patients with non-alcoholic steatohepatitis (NASH). We examined the significance of ANA in NASH. Methods: We compared clinicopathological features in patients with ANA-positive NASH (n = 35) and ANA-negative NASH (n = 36). Inflammatory cell profiles and the distribution of oxidative stress markers were also examined immunohistochemically. Results: ANA-positive NASH was significantly associated with female gender (P = 0.005), high degree of portal inflammation (P = 0.039), interface activity (P = 0.036) and hepatocellular ballooning (P = 0.0008). In addition, ANA of high titer (320-fold or more) was significantly associated with the histological grade and stage of NASH (P = 0.02). The degree of steatosis was rather mild in the high-titer ANA group (P = 0.01). The analysis of inflammatory cell profiles revealed that CD3-positive T cells were predominant and plasma cells were rather few in the portal area and hepatic lobules in both ANA-positive and ANA-negative groups. There was no difference in the distribution of oxidative stress markers between ANA-positive and ANA-negative groups. Conclusion: These findings suggest that the presence of ANA may be related to the progression of NASH and that a different type of autoimmune mechanism may be involved in the pathogenesis of NASH with ANA, compared to the pathogenesis of autoimmune hepatitis. © 2007 The Japan Society of Hepatology

    Recent Results from LHD Experiment with Emphasis on Relation to Theory from Experimentalist’s View

    The Large Helical Device (LHD) has been extending the operational regime of net-current free plasmas towards the fusion relevant condition by taking advantage of a net current-free heliotron concept and employing a superconducting coil system. Heating capability has exceeded 10 MW and the central ion and electron temperatures have reached 7 and 10 keV, respectively. The maximum value of β and pulse length have been extended to 3.2% and 150 s, respectively. Many encouraging physical findings have been obtained. Topics from recent experiments, which should be emphasized from the aspect of theoretical approaches, are reviewed. Those are (1) Prominent features in the inward shifted configuration, i.e., mitigation of an ideal interchange mode in the configuration with magnetic hill, and confinement improvement due to suppression of both anomalous and neoclassical transport, (2) Demonstration of bifurcation of radial electric field and associated formation of an internal transport barrier, and (3) Dynamics of magnetic islands and clarification of the role of separatrix.