    Extending LLMs' Context Window with 100 Samples

    Large Language Models (LLMs) are known to have limited extrapolation ability beyond their pre-trained context window, constraining their application in downstream tasks with lengthy inputs. Recent studies have sought to extend LLMs' context window by modifying rotary position embedding (RoPE), a popular position encoding method adopted by well-known LLMs such as LLaMA, PaLM, and GPT-NeoX. However, prior works like Position Interpolation (PI) and YaRN are resource-intensive and lack comparative experiments to assess their applicability. In this work, we identify the inherent need for LLMs' attention entropy (i.e., the information entropy of attention scores) to remain stable and introduce a novel extension to RoPE which combines adjusting RoPE's base frequency and scaling the attention logits to help LLMs efficiently adapt to a larger context window. We validate the superiority of our method in both fine-tuning performance and robustness across different context window sizes on various context-demanding tasks. Notably, our method extends the context window of LLaMA-2-7B-Chat to 16,384 with only 100 samples and 6 training steps, showcasing extraordinary efficiency. Finally, we also explore how data compositions and training curricula affect context window extension for specific downstream tasks, suggesting that fine-tuning LLMs with lengthy conversations is a good starting point. We release our code and SFT data at https://github.com/GAIR-NLP/Entropy-ABF.
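    A minimal sketch of the two ingredients named above — an enlarged RoPE base frequency plus a log-length temperature on the attention logits, which counteracts the entropy growth of softmax attention over longer sequences — assuming PyTorch; the base value of 500,000, the temperature formula, and the function names are illustrative assumptions, not the paper's exact recipe:

    ```python
    import math
    import torch

    def rope_inverse_frequencies(head_dim: int, base: float = 500_000.0) -> torch.Tensor:
        """RoPE inverse frequencies. Raising `base` above the usual 10,000 slows the
        rotation of low-frequency dimensions, one way to adapt to longer contexts."""
        return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

    def entropy_stabilised_attention(q, k, v, train_len: int = 4096):
        """Dot-product attention with an extra log-length factor on the logits,
        meant to keep attention entropy roughly stable past the training window."""
        seq_len = q.shape[-2]
        temp = max(1.0, math.log(seq_len) / math.log(train_len))  # grows slowly with length
        logits = (q @ k.transpose(-2, -1)) * temp / math.sqrt(q.shape[-1])
        return torch.softmax(logits, dim=-1) @ v
    ```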

    Compact broadband circularly-polarised antenna with a backed cavity for UHF RFID applications


    Self-Prompting Large Language Models for Zero-Shot Open-Domain QA

    Open-Domain Question Answering (ODQA) aims to answer questions without explicitly providing specific background documents. This task becomes notably challenging in a zero-shot setting where no data is available to train tailored retrieval-reader models. While recent Large Language Models (LLMs) like GPT-3 have demonstrated their effectiveness in zero-shot ODQA using direct prompting methods, these methods still fall short of fully harnessing the potential of LLMs when they are invoked only implicitly. In this paper, we propose a Self-Prompting framework to explicitly utilize the massive knowledge encoded in the parameters of LLMs and their strong instruction understanding abilities. Concretely, we prompt LLMs step by step to generate multiple pseudo QA pairs with background passages and explanations entirely from scratch. These generated elements are then utilized for in-context learning. Experimental results show that our method significantly surpasses previous state-of-the-art zero-shot methods on three widely-used ODQA datasets and even achieves comparable performance with various customized fine-tuned models on full training data. Our code is available at https://github.com/lockon-n/self-prompting. (Comment: NAACL 2024)
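    The self-prompting pipeline described above — generate pseudo passages, QA pairs, and explanations from the model itself, then reuse them as in-context demonstrations — can be sketched as below; `generate` is a hypothetical prompt-to-text callable standing in for any LLM API, and the prompt wording is an assumption:

    ```python
    def self_prompt_demos(generate, n_demos: int = 4) -> list[tuple[str, str]]:
        """Build pseudo QA demonstrations entirely from the LLM itself."""
        demos = []
        for _ in range(n_demos):
            passage = generate("Write a short factual, Wikipedia-style passage.")
            qa = generate(f"Passage: {passage}\nWrite a question answerable from this "
                          "passage, its short answer, and a one-sentence explanation.")
            demos.append((passage, qa))
        return demos

    def answer(generate, question: str, demos) -> str:
        """Answer a new question with the self-generated pairs as in-context examples."""
        context = "\n\n".join(f"Passage: {p}\n{qa}" for p, qa in demos)
        return generate(f"{context}\n\nQuestion: {question}\nAnswer:")
    ```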

    Government regulation of emergency supplies under the epidemic crisis

    This paper constructs a multi-oligopoly model of emergency supplies and analyses the market equilibrium under both normal and epidemic conditions. The impacts of the degree of change in market demand, externalities, the material cost of emergency supplies, and government regulation on the equilibrium results, especially on the prices of emergency supplies, are discussed. The results show that an increase in material cost leads to lower output and social welfare and a higher price, under either normal or epidemic conditions. Moreover, under epidemic conditions, the degree of change in market demand, externalities, material cost, and the presence and mode of government regulation all have multiple and complex influences on the equilibrium results. Under epidemic conditions, both government output regulation and price regulation can increase the supply of emergency supplies. In addition, when market demand changes drastically, consumer surplus and social welfare can be enhanced by the implementation of regulations. In particular, price regulation is more effective when the material cost is high.
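    The headline comparative static — higher material cost lowers output and raises price — already holds in a textbook symmetric Cournot oligopoly, sketched below with an assumed linear inverse demand P = a - bQ; this is a generic stand-in, not the paper's actual multi-oligopoly specification:

    ```python
    def cournot_equilibrium(n: int, a: float, b: float, c: float):
        """Symmetric n-firm Cournot equilibrium with inverse demand P = a - b*Q
        and constant marginal (material) cost c."""
        q = (a - c) / (b * (n + 1))  # per-firm output
        Q = n * q                    # total supply
        P = a - b * Q                # market price, equal to (a + n*c)/(n + 1)
        return q, Q, P

    # Raising the material cost c lowers total supply and raises the price:
    for c in (1.0, 2.0, 3.0):
        q, Q, P = cournot_equilibrium(n=3, a=10.0, b=1.0, c=c)
        print(f"c={c}: q={q:.2f}, Q={Q:.2f}, P={P:.2f}")
    ```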

    Further Development of the Improved QMD Model and its Applications to Fusion Reaction near Barrier

    The Improved Quantum Molecular Dynamics model is further developed by introducing new parameters into the interaction potential energy functional, based on the Skyrme interactions SkM* and the SLy series. The ground-state properties of selected nuclei are reproduced very well. The Coulomb barriers for a series of reaction systems are studied and compared with the results of the proximity potential. The fusion excitation functions for a series of fusion reactions are calculated, and the results are in good agreement with experimental data. (Comment: 17 pages, 10 figures, accepted by PRC)
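    For orientation, a back-of-the-envelope Coulomb barrier can be obtained from a touching-spheres estimate; the sketch below is the standard textbook formula with an assumed radius parameter r0, not the ImQMD or proximity-potential calculation used in the paper:

    ```python
    E2 = 1.44  # e^2/(4*pi*eps0) in MeV*fm

    def coulomb_barrier(z1: int, a1: int, z2: int, a2: int, r0: float = 1.4) -> float:
        """Touching-spheres estimate of the fusion Coulomb barrier in MeV:
        V_B ~ Z1*Z2*e^2 / (r0 * (A1^(1/3) + A2^(1/3)))."""
        r_touch = r0 * (a1 ** (1 / 3) + a2 ** (1 / 3))
        return E2 * z1 * z2 / r_touch

    # Example: 16O + 208Pb, a classic near-barrier fusion system (~80 MeV here).
    print(f"{coulomb_barrier(8, 16, 82, 208):.1f} MeV")
    ```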

    Spatial variation of perceived equity and its determinants in a gateway community of Giant Panda National Park, China

    Social equity is essential in the governance of protected areas (PAs), as ignoring such considerations can lead to resistance and jeopardize conservation objectives. However, more research is required to understand the spatial heterogeneity of perceived social equity and its underlying spatial factors. Using a survey of 361 respondents, we present spatial distribution patterns of perceived equity estimated by kernel density estimation (KDE) in Giant Panda National Park, China. The regression analysis shows that local residents who live closer to the PA boundary are more likely to develop negative responses, and those with easy access to tourism spots have more positive procedural and distributional perceptions. Notably, proximity to the PA authority decreases locals' perceptions of fairness in all aspects, potentially due to the opaque participative channels provided by the PA authority. We argue that these spatial differentials in fairness perceptions are driven by the intrinsic discrepancy of biodiversity protection requirements and the unevenly distributed consequences of management policies. Key steps to advance social equity considerations include multi-industry guidance, extending participative channels, and co-producing better compensation plans. Overall, this study calls for a greater focus on the spatial aspect of social equity issues in PAs.
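    A weighted kernel density estimate of this kind can be computed directly with SciPy; the sketch below uses synthetic coordinates and equity scores (both assumptions — the paper's survey data and exact KDE settings are not reproduced here):

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(0)
    coords = rng.uniform(0.0, 1.0, size=(2, 361))  # (lon, lat) of 361 respondents
    equity = rng.uniform(1.0, 5.0, size=361)       # Likert-style perceived-equity scores

    # Score-weighted Gaussian KDE: density is high where positive perceptions cluster.
    kde = gaussian_kde(coords, weights=equity / equity.sum())

    # Evaluate on a grid to map the spatial pattern of perceived equity.
    gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
    density = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(50, 50)
    ```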

    Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation

    Pre-trained Language Models (PrLMs) have been widely used as backbones in many Natural Language Processing (NLP) tasks. The common process of utilizing PrLMs is to first pre-train on large-scale general corpora with task-independent LM training objectives, then fine-tune on task datasets with task-specific training objectives. Pre-training in a task-independent way enables the models to learn language representations that are universal to some extent, but it fails to capture crucial task-specific features. This leads to an incompatibility between pre-training and fine-tuning. To address this issue, we introduce task-specific pre-training on in-domain, task-related corpora with task-specific objectives. This procedure is placed between the original two stages to enhance the model's understanding of specific tasks. In this work, we focus on Dialogue-related Natural Language Processing (DrNLP) tasks and design a Dialogue-Adaptive Pre-training Objective (DAPO) based on important qualities for assessing dialogues that are usually ignored by general LM pre-training objectives. PrLMs trained with DAPO on a large in-domain dialogue corpus are then fine-tuned for downstream DrNLP tasks. Experimental results show that models with DAPO surpass those with general LM pre-training objectives and other strong baselines on downstream DrNLP tasks.
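    One plausible shape for such an intermediate, dialogue-adaptive stage is to add a small head that scores a dialogue quality (e.g. coherence) alongside the usual LM loss; the sketch below is a hypothetical PyTorch illustration, not the paper's actual DAPO formulation:

    ```python
    import torch
    import torch.nn as nn

    class DialogueAdaptiveObjective(nn.Module):
        """Combine the LM loss with an auxiliary dialogue-quality loss so the
        intermediate stage rewards features that general LM objectives ignore."""
        def __init__(self, hidden_size: int):
            super().__init__()
            self.scorer = nn.Linear(hidden_size, 1)

        def forward(self, hidden_states, quality_labels, lm_loss, alpha: float = 0.5):
            pooled = hidden_states.mean(dim=1)      # (batch, hidden)
            pred = self.scorer(pooled).squeeze(-1)  # per-dialogue quality score
            quality_loss = nn.functional.mse_loss(pred, quality_labels)
            return lm_loss + alpha * quality_loss   # joint pre-training objective
    ```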