
    CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training

    Speech and text representations generated by pre-trained models contain modality-specific information that can be combined to benefit spoken language understanding (SLU) tasks. In this work, we propose a novel pre-training paradigm termed Continuous Integrate-and-Fire Pre-Training (CIF-PT). It relies on a simple but effective frame-to-token alignment, continuous integrate-and-fire (CIF), to bridge the representations of speech and text, and jointly performs speech-to-text training and language model distillation through CIF as the pre-training (PT). Evaluated on the SLU benchmark SLURP, CIF-PT outperforms the state-of-the-art model by 1.94% in accuracy and 2.71% in SLU-F1 on the tasks of intent classification and slot filling, respectively. We also observe that the cross-modal representation extracted by CIF-PT outperforms other neural interfaces on SLU tasks, including the dominant speech representation learned from self-supervised pre-training. Comment: Accepted to Findings of ACL 2023.
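    The CIF alignment described above can be illustrated with a minimal sketch: a scalar weight is accumulated frame by frame, and a token-level vector is emitted ("fired") whenever the accumulated weight crosses a threshold, with the boundary frame's weight split between adjacent tokens. The function name and threshold convention below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def cif_integrate(frames, alphas, threshold=1.0):
    """Aggregate frame-level features into token-level features.

    frames: (T, D) array of frame representations
    alphas: (T,) non-negative firing weights
    Fires a token whenever the accumulated weight reaches `threshold`,
    splitting the boundary frame's weight between adjacent tokens.
    Any trailing weight below the threshold is discarded in this sketch.
    """
    tokens = []
    acc_weight = 0.0
    acc_state = np.zeros(frames.shape[1])
    for h, a in zip(frames, alphas):
        if acc_weight + a < threshold:
            acc_weight += a
            acc_state = acc_state + a * h
        else:
            used = threshold - acc_weight        # part of this frame closes the token
            tokens.append(acc_state + used * h)
            acc_weight = a - used                # remainder starts the next token
            acc_state = (a - used) * h
    return np.array(tokens)
```

With uniform weights of 0.5 over four identical frames, the integrator fires exactly twice, each token being the weighted sum of two frames.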

    Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System

    Large-scale Language Models (LLMs) are constrained by their inability to process lengthy inputs. To address this limitation, we propose the Self-Controlled Memory (SCM) system to unleash infinite-length input capacity for large-scale language models. Our SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) to generate precise and coherent responses; it determines which memories from the archived memory should be activated and how to incorporate them into the model input. Our SCM system can be integrated with any LLM to enable it to process ultra-long texts without any modification or fine-tuning. Experimental results show that our SCM system enables LLMs that are not optimized for multi-turn dialogue to achieve multi-turn dialogue capabilities comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations. Additionally, we will supply a test set covering common long-text input scenarios for evaluating the abilities of LLMs to process long documents. Code: https://github.com/wbbeyourself/SCM4LLMs. Comment: Work in progress.
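    The three-module loop described above can be sketched as follows; the class and method names, the keyword-overlap relevance score, and the truncation-based summaries are all illustrative assumptions standing in for the paper's LLM-driven controller.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    turn: int
    text: str
    summary: str

@dataclass
class SelfControlledMemory:
    """Toy SCM loop: the agent processes one chunk per turn, the memory
    stream archives every turn, and the controller activates archived
    memories relevant to the current input."""
    stream: list = field(default_factory=list)
    flash_size: int = 2   # how many recent turns count as short-term memory

    def controller(self, query, k=2):
        # Toy relevance: word overlap; the paper's controller is LLM-driven.
        def score(item):
            return len(set(query.lower().split()) & set(item.text.lower().split()))
        archived = self.stream[:-self.flash_size]
        return sorted(archived, key=score, reverse=True)[:k]

    def build_prompt(self, query):
        long_term = self.controller(query)                # activated archived memory
        short_term = self.stream[-self.flash_size:]       # flash memory: recent turns
        parts = ["[archived] " + m.summary for m in long_term]
        parts += ["[flash] " + m.text for m in short_term]
        parts.append("[input] " + query)
        return "\n".join(parts)

    def step(self, turn, text, summarize=lambda t: t[:40]):
        prompt = self.build_prompt(text)   # what the agent would see this turn
        self.stream.append(MemoryItem(turn, text, summarize(text)))
        return prompt
```

Each `step` builds the model input from activated archived memories plus the flash window, then archives the new turn, so the prompt stays bounded even as the stream grows without limit.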

    An effective route to the additive manufacturing of a mechanically gradient supramolecular polymer nanocomposite structure

    3D printing techniques are additive methods of fabricating parts directly from computer-aided designs. While their clearest benefit is geometrical freedom, multi-material printing also allows the introduction of compositional variation and highly tailored product functionality. This paper reports a proof-of-concept additive manufacturing study that deposits a supramolecular polymer and a complementary organic filler to form composites with gradient composition, enabling spatial distribution of mechanical properties and functionality by tuning the number of supramolecular interactions. We use a dual-feed extrusion 3D printing process, with feedstocks based on the supramolecular polymer and its organic composite delivered at predetermined ratios. This allows production of a graded specimen whose varying filler concentration dictates the mechanical properties. The printed specimen was inspected under dynamic load in a tensile test using digital image correlation to produce full-field deformation maps, which showed clear differences in deformation between regions of varying composition, corresponding to the designed-in variations. This approach affords a novel method for printing materials with graded mechanical properties that are not currently commercially available or easily accessible; moreover, the method can potentially be translated directly to the generation of biomaterial-based composites featuring gradients of mechanical properties.
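    The dual-feed ratio idea can be sketched as a small helper: given a target filler concentration at a point along the specimen, it returns the fraction of each feedstock to deliver. The fixed filler fraction of the composite feedstock and the assumption that the polymer feed carries no filler are hypothetical values for illustration, not the paper's formulation.

```python
def dual_feed_ratios(target_filler, feed_filler_fraction=0.4):
    """Fractions of (composite, pure polymer) feedstock that blend to a
    deposit with `target_filler` filler content, assuming the composite
    feed carries `feed_filler_fraction` filler and the polymer feed none.

    Mass balance: r * feed_filler_fraction = target_filler, so
    r = target_filler / feed_filler_fraction.
    """
    if not 0.0 <= target_filler <= feed_filler_fraction:
        raise ValueError("target filler content outside achievable range")
    r_composite = target_filler / feed_filler_fraction
    return r_composite, 1.0 - r_composite
```

Sweeping the target concentration along the print axis then yields the predetermined ratio schedule for a linearly graded specimen.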

    A review of hybrid wave-tidal energy conversion technology

    Ocean renewable energy, such as wave and tidal energy, is important for energy supply and for the decarbonization of offshore platforms and ships. However, the intermittent and non-dispatchable nature of wave and tidal energy remains a significant challenge. Hybrid wave-tidal energy conversion presents a potential solution to enhance output power and stability by leveraging their complementary characteristics. This paper reviews the current state of hybrid wave-tidal energy conversion technology, focusing on device design, modeling methods, and testing methods. Many current hybrid wave-tidal energy converters (HWTEC) have not considered effective coupling among their sub-system modules to maximize efficiency. Modeling and simulation methods are mainly based on studies of single wave or tidal energy conversion, and non-linear system modeling is rare. The assumption of continuous functions in most models can lead to discrepancies from real-life conditions. Advances in modeling approaches, co-simulation algorithms, and dedicated dry-lab and pool tests for HWTEC are needed. While HWTEC holds promise for improving ocean energy conversion, addressing key challenges in device design, modeling, testing, and economic evaluation is essential for realizing its full potential in contributing to decarbonization.

    Fault Identification of Electric Submersible Pumps Based on Unsupervised and Multi-Source Transfer Learning Integration

    The ratio between normal data and fault data generated by electric submersible pumps (ESPs) in production tends to be imbalanced: the information carried by the fault data, as the minority class, is easily overwhelmed by the majority normal data, which seriously degrades fault identification. To address the problem that data imbalance under different ESP working conditions prevents fault data from being identified effectively, a fault identification method for ESPs based on unsupervised feature extraction integrated with transfer learning was proposed. Firstly, new features were extracted from the data using multiple unsupervised methods to enhance its representational power. Secondly, multiple source-domain samples were obtained by repeated random sampling of the training set so that minority samples are fully trained. Thirdly, the discrepancy between the source and target domains was reduced by applying weighted balanced distribution adaptation (W-BDA). Finally, several base learners were constructed and combined into a stronger ensemble classifier to accomplish the ESP fault identification tasks. Compared with other fault identification methods, our method not only effectively enhances the fault data features and improves the identification of minority fault data, but also copes with fault identification under different working conditions.
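    The sampling-and-ensemble steps above can be sketched as follows; the nearest-centroid base learner is a stand-in for the paper's learners, and the W-BDA domain-adaptation step is omitted from this sketch.

```python
import numpy as np

def balanced_bootstrap_ensemble(X, y, n_learners=5, rng=None):
    """Each base learner trains on a balanced subsample that keeps all
    minority (fault) samples and downsamples the majority (normal) class;
    the learners then predict by majority vote."""
    rng = np.random.default_rng(rng)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    min_idx = np.where(y == minority)[0]
    maj_idx = np.where(y != minority)[0]
    learners = []
    for _ in range(n_learners):
        # downsample the majority class to the minority-class size
        sampled_maj = rng.choice(maj_idx, size=len(min_idx), replace=False)
        idx = np.concatenate([min_idx, sampled_maj])
        # nearest-centroid base learner: one centroid per class
        cents = {c: X[idx][y[idx] == c].mean(axis=0) for c in classes}
        learners.append(cents)

    def predict(Xq):
        votes = []
        for cents in learners:
            labels = list(cents)
            d = np.stack([np.linalg.norm(Xq - cents[c], axis=1) for c in labels])
            votes.append(np.array(labels)[np.argmin(d, axis=0)])
        votes = np.stack(votes)
        # majority vote across the ensemble
        return np.array([np.bincount(col).argmax() for col in votes.T])

    return predict
```

Keeping every minority sample in every subsample is what lets the rare fault class be "fully trained" despite the imbalance.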

    Improving Contextual Representation with Gloss Regularized Pre-training

    Though achieving impressive results on many NLP tasks, BERT-like masked language models (MLMs) encounter a discrepancy between pre-training and inference. In light of this gap, we investigate the contextual representations of pre-training and inference from the perspective of word probability distributions. We discover that BERT risks neglecting contextual word similarity during pre-training. To tackle this issue, we propose adding an auxiliary gloss regularizer module to BERT pre-training (GR-BERT) to enhance word semantic similarity. By predicting masked words and simultaneously aligning contextual embeddings to the corresponding glosses, word similarity can be explicitly modeled. We design two architectures for GR-BERT and evaluate our model on downstream tasks. Experimental results show that the gloss regularizer benefits BERT in both word-level and sentence-level semantic representation. GR-BERT achieves a new state of the art on the lexical substitution task and greatly improves BERT sentence representations in both unsupervised and supervised STS tasks. Comment: Accepted to Findings of NAACL 2022.
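    The joint objective described above, masked-word prediction plus gloss alignment, can be sketched for a single masked position; the loss weighting and the cosine-distance form of the gloss term are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gloss_regularized_loss(ctx_emb, vocab_logits, target_id, gloss_emb, lam=0.5):
    """Toy joint loss for one masked position: standard MLM cross-entropy
    over the vocabulary, plus a regularizer pulling the contextual
    embedding toward the target word's gloss embedding."""
    # MLM term: negative log-probability of the target word
    logits = vocab_logits - vocab_logits.max()          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    mlm_loss = -log_probs[target_id]
    # Gloss term: 1 - cosine similarity between contextual and gloss embedding
    cos = ctx_emb @ gloss_emb / (np.linalg.norm(ctx_emb) * np.linalg.norm(gloss_emb))
    gloss_loss = 1.0 - cos
    return mlm_loss + lam * gloss_loss
```

A contextual embedding aligned with the correct gloss incurs a strictly lower loss than a misaligned one, which is the signal that makes word similarity explicit during pre-training.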
