25 research outputs found

    CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training

    Full text link
    Speech and text representations generated by pre-trained models contain modality-specific information that can be combined to benefit spoken language understanding (SLU) tasks. In this work, we propose a novel pre-training paradigm termed Continuous Integrate-and-Fire Pre-Training (CIF-PT). It relies on a simple but effective frame-to-token alignment, continuous integrate-and-fire (CIF), to bridge the representations between speech and text. It jointly performs speech-to-text training and language model distillation through CIF as the pre-training (PT). Evaluated on the SLURP SLU benchmark, CIF-PT outperforms the state-of-the-art model by 1.94% in accuracy on intent classification and 2.71% in SLU-F1 on slot filling. We also observe that the cross-modal representation extracted by CIF-PT performs better than other neural interfaces for SLU tasks, including the dominant speech representation learned from self-supervised pre-training. Comment: Accepted by ACL 2023 Finding
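
    The frame-to-token alignment named in this abstract can be illustrated with a minimal sketch of the general integrate-and-fire idea. It is not the authors' implementation: the function name `cif_tokens`, the firing threshold of 1.0, and the assumption that per-frame weights are supplied as plain inputs (in a trained model they would be predicted from the frames) are all illustrative choices.

```python
def cif_tokens(frames, alphas, threshold=1.0):
    """Integrate weighted frame vectors and fire one token vector each
    time the accumulated weight reaches `threshold` (assumed 1.0)."""
    dim = len(frames[0])
    tokens = []
    acc_weight = 0.0          # weight accumulated since the last firing
    acc_vec = [0.0] * dim     # weighted sum of frames since the last firing
    for frame, alpha in zip(frames, alphas):
        remaining = alpha
        # A frame that crosses the threshold is split: part of its weight
        # completes the current token, the rest starts the next one.
        while acc_weight + remaining >= threshold:
            used = threshold - acc_weight
            acc_vec = [a + used * f for a, f in zip(acc_vec, frame)]
            tokens.append(acc_vec)
            remaining -= used
            acc_weight = 0.0
            acc_vec = [0.0] * dim
        acc_weight += remaining
        acc_vec = [a + remaining * f for a, f in zip(acc_vec, frame)]
    return tokens


# Four 1-dimensional frames with weight 0.5 each yield two tokens.
even = cif_tokens([[1.0], [3.0], [5.0], [7.0]], [0.5, 0.5, 0.5, 0.5])
# Two frames with weight 0.75 each yield one token; 0.5 of weight is left over.
split = cif_tokens([[2.0], [4.0]], [0.75, 0.75])
```

    Any weight left over after the last frame is simply dropped in this sketch; how a real system handles that tail is not stated in the abstract.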

    Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System

    Full text link
    Large-scale Language Models (LLMs) are constrained by their inability to process lengthy inputs. To address this limitation, we propose the Self-Controlled Memory (SCM) system to unleash infinite-length input capacity for large-scale language models. Our SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) to generate precise and coherent responses. The controller determines which memories from archived memory should be activated and how to incorporate them into the model input. Our SCM system can be integrated with any LLM to enable it to process ultra-long texts without any modification or fine-tuning. Experimental results show that our SCM system enables LLMs that are not optimized for multi-turn dialogue to achieve multi-turn dialogue capabilities comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations. Additionally, we will supply a test set, which covers common long-text input scenarios, for evaluating the abilities of LLMs in processing long documents. Code: https://github.com/wbbeyourself/SCM4LLMs Comment: Working in progres
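
    The split between archived and flash memory described in this abstract can be sketched as follows. This is a hypothetical illustration, not the SCM code: the names `select_memories` and `build_prompt`, the word-overlap relevance score, and the `flash_size` and `top_k` parameters are assumptions, since the abstract does not say how the controller scores or inserts memories.

```python
def overlap(query, memory):
    """Stand-in relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(memory.lower().split()))


def select_memories(query, history, flash_size=2, top_k=1):
    """Return (activated archived memories, flash memory).
    Assumes flash_size >= 1."""
    flash = history[-flash_size:]       # most recent turns (short-term)
    archived = history[:-flash_size]    # all older turns (long-term)
    scored = [(overlap(query, m), i, m) for i, m in enumerate(archived)]
    # Highest score first; ties resolved in favor of the older memory.
    ranked = sorted(scored, key=lambda t: (-t[0], t[1]))
    activated = [m for score, _, m in ranked if score > 0][:top_k]
    return activated, flash


def build_prompt(query, history, flash_size=2, top_k=1):
    """Place activated archived memories and flash memory before the query."""
    activated, flash = select_memories(query, history, flash_size, top_k)
    return "\n".join(activated + flash + [query])


history = [
    "my dog is named Rex",
    "I like hiking",
    "the weather is nice",
    "let us plan lunch",
]
activated, flash = select_memories("what is my dog called", history)
prompt = build_prompt("what is my dog called", history)
```

    A real controller would likely use learned embeddings or the language model itself to judge relevance; word overlap is used here only to keep the sketch self-contained.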

    Fault Identification of Electric Submersible Pumps Based on Unsupervised and Multi-Source Transfer Learning Integration

    No full text
    The ratio between normal data and fault data generated by electric submersible pumps (ESPs) in production is prone to imbalance. The information carried by the fault data, which are generally the minority class, is easily overwhelmed by the normal data, which are the majority class, and this seriously interferes with fault identification. To address the problem that data imbalance under different working conditions of ESPs prevents fault data from being identified effectively, a fault identification method for ESPs based on unsupervised feature extraction integrated with transfer learning was proposed. Firstly, new features were extracted from the data using multiple unsupervised methods to enhance the representational power of the data. Secondly, multiple source-domain samples were obtained by repeated random sampling of the training set so that minority samples were fully trained. Thirdly, the discrepancy between the source domain and target domain was reduced by applying weighted balanced distribution adaptation (W-BDA). Finally, several base learners were constructed and combined into a stronger classifier to accomplish the ESP fault identification tasks. Compared with other fault identification methods, our method not only effectively enhances the performance of fault data features and improves the identification of minority fault data, but also copes with fault identification under different working conditions.
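
    The second and fourth steps of this abstract (repeated random sampling into several source domains, then combining base learners) can be sketched as below. The sketch rests on assumptions not stated in the abstract: each source domain is balanced by drawing the minority-class count from every class, and the learners are combined by majority vote. The names `balanced_source_domains` and `majority_vote` are hypothetical, and the unsupervised feature extraction and W-BDA steps are omitted.

```python
import random
from collections import Counter


def balanced_source_domains(samples, labels, n_domains=3, seed=0):
    """Draw `n_domains` class-balanced subsets from an imbalanced training set."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    minority_size = min(len(xs) for xs in by_class.values())
    domains = []
    for _ in range(n_domains):
        domain = []
        for y, xs in by_class.items():
            # Every minority sample is kept in each domain; the majority
            # class is undersampled to the same size.
            domain.extend((x, y) for x in rng.sample(xs, minority_size))
        domains.append(domain)
    return domains


def majority_vote(predictions):
    """Combine the base learners' predicted labels for one sample."""
    return Counter(predictions).most_common(1)[0][0]


samples = list(range(10))
labels = ["normal"] * 8 + ["fault"] * 2
domains = balanced_source_domains(samples, labels)
combined = majority_vote(["fault", "normal", "fault"])
```

    One base learner would be trained per domain, so every learner sees all of the fault samples while the ensemble as a whole still covers most of the normal data.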

    Improving Contextual Representation with Gloss Regularized Pre-training

    Full text link
    Though achieving impressive results on many NLP tasks, BERT-like masked language models (MLMs) suffer from a discrepancy between pre-training and inference. In light of this gap, we investigate the contextual representation of pre-training and inference from the perspective of word probability distribution. We discover that BERT risks neglecting contextual word similarity in pre-training. To tackle this issue, we propose adding an auxiliary gloss regularizer module to BERT pre-training (GR-BERT) to enhance word semantic similarity. By predicting masked words and aligning contextual embeddings to the corresponding glosses simultaneously, word similarity can be explicitly modeled. We design two architectures for GR-BERT and evaluate our model on downstream tasks. Experimental results show that the gloss regularizer benefits BERT in word-level and sentence-level semantic representation. GR-BERT achieves a new state of the art on the lexical substitution task and greatly improves BERT sentence representation in both unsupervised and supervised STS tasks. Comment: Accepted to Findings of NAACL 202

    A computational strategy for finding novel targets and therapeutic compounds for opioid dependence.

    No full text
    Opioids are widely used for treating different types of pain, but overuse and abuse of prescription opioids have led to an opioid epidemic in the United States. Besides analgesic effects, chronic use of opioids can also cause tolerance, dependence, and even addiction. Effective treatment of opioid addiction remains a big challenge today. Studies on the addictive effects of opioids focus on the striatum, a main component in the brain responsible for drug dependence and addiction. Some transcription regulators have been associated with opioid addiction, but the relationship between the analgesic effects of opioids and the dependence behaviors mediated by them at the molecular level has not been thoroughly investigated. In this paper, we developed a new computational strategy that identifies novel targets and potential therapeutic molecular compounds for opioid dependence and addiction. Using time course gene expression data from mouse striatum, we employed several statistical and machine learning techniques and identified genes differentially expressed over time that were associated with dependence-related behaviors after exposure to either morphine or heroin, as well as potential transcription regulators that regulate these genes. Moreover, our findings revealed that some of these dependence-associated genes and transcription regulators are known to play key roles in opioid-mediated analgesia and tolerance, suggesting that an intricate relationship between opioid-induced pain-related pathways and dependence may develop at an early stage during opioid exposure. Finally, we determined small compounds that can potentially target the dependence-associated genes and transcription regulators. These compounds may facilitate development of effective therapy for opioid dependence and addiction. We also built a database (http://daportals.org) for all opioid-induced dependence-associated genes and transcription regulators that we discovered, as well as the small compounds that target those genes and transcription regulators.

    A Semi-Analytical Solution for Shock Wave Pressure and Radius of Soil Plastic Zone Induced by Lightning Strikes

    No full text
    A semi-analytical solution for forecasting the soil behavior induced by lightning strikes is of great engineering significance for calculating the radius of the soil plastic zone. In this paper, a simplified two-stage method is employed to solve for the shock wave pressure and the radius of the soil plastic zone. The solution is verified against experimental data. Using the present model, the major factors dominating the shock wave pressure and the radius of the soil plastic zone are investigated. The results show that (1) the radius of the soil plastic zone (rp) induced by lightning decreases monotonically with cohesion (c) and internal friction angle (φ), while c has a stronger effect on soil properties than φ does; (2) increasing the initial radius of the plasma channel (ri0) can reduce the pressure (P), and increasing ri0 has a nonnegligible effect on rp: with ri0 increasing by 100%, the radius of the soil plastic zone increases by 47.9–59.7%; (3) the plasma channel length (L) has a significant influence on P and rp, especially when L is at a relatively low level; (4) the rp induced by lightning decreases exponentially with the attenuation coefficient (a); (5) the wavefront time is a major factor while the half-value time is a minor factor for the shock wave pressure induced by plasma explosives.