4,777 research outputs found

    Debye formulas for a relaxing system with memory

    Rate (master) equations are ubiquitous in statistical physics, yet, to the best of our knowledge, a rate equation with memory has never previously been considered. We write down an integro-differential rate equation for the evolution of a thermally relaxing system with memory. For concreteness we adopt as a model a single-domain magnetic particle driven by a small ac field and derive the modified Debye formulas. For any memory time Θ the in-phase component of the resultant ac susceptibility is positive at small probing frequencies ω but becomes negative at large ω. The system thus exhibits frequency-induced diamagnetism. For comparison we also consider particle pairs with dipolar coupling. The memory effect is found to be enhanced by ferromagnetic coupling and suppressed by antiferromagnetic coupling. Numerical calculations support the prediction of a negative susceptibility, which arises from a phase shift induced by the memory effect. It is proposed that the onset of frequency-induced diamagnetism represents a viable experimental signature of correlated noise.
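    To make the claimed sign change of the in-phase susceptibility concrete, here is a minimal sketch, assuming an exponential memory kernel K(s) = Θ⁻¹ e^{−s/Θ}, a single relaxation rate Γ = 1/τ, and an instantaneous equilibrium n_eq(t) that follows the ac field; the paper's actual kernel and modified Debye formulas may differ:

        % Rate equation with memory (sketch) and the resulting ac susceptibility
        \begin{align}
          \frac{dn}{dt} &= -\Gamma \int_0^{t} \frac{1}{\Theta}\,
              e^{-(t-t')/\Theta}\,\bigl[n(t') - n_{\mathrm{eq}}(t')\bigr]\,dt', \\
          \chi(\omega) &= \frac{\chi_0}{1 - \omega^2\tau\Theta + i\omega\tau},
          \qquad
          \chi'(\omega) = \chi_0\,
              \frac{1 - \omega^2\tau\Theta}{\bigl(1 - \omega^2\tau\Theta\bigr)^2 + \omega^2\tau^2}.
        \end{align}

    In this sketch χ'(ω) is positive for ω < 1/√(τΘ) and turns negative above it, while letting Θ → 0 recovers the ordinary Debye form χ(ω) = χ_0/(1 + iωτ).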

    Bridging the Gap between Different Vocabularies for LLM Ensemble

    Ensembling different large language models (LLMs) to unleash their complementary potential and harness their individual strengths is highly valuable. Nevertheless, vocabulary discrepancies among various LLMs have constrained previous studies to either selecting or blending completely generated outputs. This limitation hinders the dynamic correction and enhancement of outputs during the generation process, resulting in a limited capacity for effective ensembling. To address this issue, we propose a novel method to Ensemble LLMs via Vocabulary Alignment (EVA). EVA bridges the lexical gap among various LLMs, enabling careful ensembling at each generation step. Specifically, we first learn mappings between the vocabularies of different LLMs with the assistance of overlapping tokens. Subsequently, these mappings are employed to project the output distributions of the LLMs into a unified space, facilitating a fine-grained ensemble. Finally, we design a filtering strategy to exclude models that generate unfaithful tokens. Experimental results on commonsense reasoning, arithmetic reasoning, machine translation, and data-to-text generation tasks demonstrate the superiority of our approach compared with individual LLMs and previous ensemble methods conducted on complete outputs. Further analyses confirm that our approach can leverage knowledge from different language models and yield consistent improvement. Comment: Accepted to the main conference of NAACL 202
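    As a rough illustration of ensembling at each generation step, the sketch below assumes a precomputed row-stochastic mapping matrix per model; how EVA actually learns these mappings from overlapping tokens, and its token filtering strategy, are not reproduced, and the function names and uniform weighting are illustrative only:

        import numpy as np

        def project_distribution(p_src: np.ndarray, mapping: np.ndarray) -> np.ndarray:
            """Project a next-token distribution from one model's vocabulary into a
            shared (pivot) vocabulary via a row-stochastic mapping matrix."""
            p_pivot = p_src @ mapping          # (V_src,) @ (V_src, V_pivot) -> (V_pivot,)
            return p_pivot / p_pivot.sum()     # renormalise for numerical safety

        def ensemble_step(dists, mappings, weights=None):
            """One generation step: project every model's distribution into the
            shared space, average, and return the selected pivot token id."""
            projected = [project_distribution(p, m) for p, m in zip(dists, mappings)]
            if weights is None:
                weights = np.full(len(projected), 1.0 / len(projected))
            mixed = np.average(np.stack(projected), axis=0, weights=weights)
            return int(np.argmax(mixed)), mixed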

    Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning

    Large language models (LLMs) have shown remarkable capabilities in various natural language understanding tasks. With only a few demonstration examples, these LLMs can quickly adapt to target tasks without expensive gradient updates. Common strategies to boost such 'in-context' learning ability are to ensemble multiple decoded results from the model and to require the model to generate an explanation along with its prediction. However, these approaches often treat different class predictions equally and neglect the potential discrepancy between the explanations and the predictions. To fully unleash the power of explanations, we propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs. We design two techniques, explanation-guided ensemble and soft probability aggregation, to mitigate the effect of unreliable explanations and improve the consistency between explanations and final predictions. Experiments on seven natural language understanding tasks and four varying-size LLMs demonstrate the effectiveness of our proposed framework.
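    The soft-probability-aggregation idea can be sketched as follows, assuming per-sample class probabilities are available (e.g. from label-token logits) and that each explanation carries some reliability score; the scoring itself and the explanation-guided ensemble step from the paper are not reproduced here:

        import numpy as np

        def soft_ensemble(class_probs, explanation_scores=None):
            """Aggregate (explanation, prediction) samples by averaging class
            probabilities instead of taking a hard majority vote.

            class_probs        : (n_samples, n_classes) probabilities per sample.
            explanation_scores : optional (n_samples,) reliability weights
                                 (hypothetical; unreliable explanations are
                                 down-weighted instead of counting equally).
            """
            probs = np.asarray(class_probs, dtype=float)
            if explanation_scores is None:
                weights = np.ones(len(probs))
            else:
                weights = np.asarray(explanation_scores, dtype=float)
            weights = weights / weights.sum()
            aggregated = (weights[:, None] * probs).sum(axis=0)
            return int(np.argmax(aggregated)), aggregated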

    The Egyptian, June 24, 1936


    LLM-Ensemble: Optimal Large Language Model Ensemble Method for E-commerce Product Attribute Value Extraction

    Product attribute value extraction is a pivotal component in Natural Language Processing (NLP) and the contemporary e-commerce industry. The provision of precise product attribute values is fundamental in ensuring high-quality recommendations and enhancing customer satisfaction. The recently emerging Large Language Models (LLMs) have demonstrated state-of-the-art performance in numerous attribute extraction tasks, without the need for domain-specific training data. Nevertheless, different LLMs exhibit varying strengths and weaknesses due to the diversity in their data, architectures, and hyperparameters. This variation makes them complementary to each other, with no single LLM dominating all the others. Considering the diverse strengths and weaknesses of LLMs, it becomes necessary to develop an ensemble method that leverages their complementary potentials. In this paper, we propose a novel algorithm called LLM-ensemble to ensemble different LLMs' outputs for attribute value extraction. We iteratively learn the weights for different LLMs and aggregate the weighted labels to predict the final attribute value. Not only can our proposed method be proven theoretically optimal, but it also ensures efficient computation, fast convergence, and safe deployment. We have also conducted extensive experiments with various state-of-the-art LLMs, including Llama2-13B, Llama2-70B, PaLM-2, GPT-3.5, and GPT-4, on Walmart's internal data. Our offline metrics demonstrate that the LLM-ensemble method outperforms all the state-of-the-art single LLMs on Walmart's internal dataset. This method has been launched in several production models, leading to improved Gross Merchandise Volume (GMV), Click-Through Rate (CTR), Conversion Rate (CVR), and Add-to-Cart Rate (ATC). Comment: SIGIR 2024 industry track
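    The iterative weight learning can be pictured with a simple agreement-based reweighting loop over categorical attribute values, sketched below; the paper's actual update rule, and the argument for its theoretical optimality, are not reproduced here:

        from collections import Counter

        def ensemble_labels(llm_outputs, n_iters=10):
            """Iteratively learn per-LLM weights and aggregate attribute-value labels.

            llm_outputs : list of lists; llm_outputs[m][i] is LLM m's predicted
                          value (a string) for item i.
            """
            n_models = len(llm_outputs)
            n_items = len(llm_outputs[0])
            weights = [1.0 / n_models] * n_models

            for _ in range(n_iters):
                # 1. Weighted vote for each item's consensus label.
                consensus = []
                for i in range(n_items):
                    scores = Counter()
                    for m in range(n_models):
                        scores[llm_outputs[m][i]] += weights[m]
                    consensus.append(scores.most_common(1)[0][0])
                # 2. Re-weight each LLM by how often it agrees with the consensus.
                accs = [
                    sum(llm_outputs[m][i] == consensus[i] for i in range(n_items)) / n_items
                    for m in range(n_models)
                ]
                total = sum(accs) or 1.0
                weights = [a / total for a in accs]
            return consensus, weights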

    Experimental and Theoretical Study on a Transient, Turbulent Free Hydrogen Gas Jet Issuing into Still Air

    Distributions of hydrogen gas concentration in a suddenly started, single-shot hydrogen gas jet issuing from a 1 mm diameter injector into still air were measured using a laser interferometry method. This unsteady, turbulent free jet flow was also calculated using the two-equation, high-Reynolds-number version of the k-ε turbulence model and a hybrid scheme for treating combined diffusion and convection within the SIMPLE algorithm. The injection pressure was 0.5 MPa, for which the predicted and measured temporal jet tip penetration distributions indicate that the jet discharged into still air at Mach 0.25. The level of agreement between the present predictions and measurements is good in some regions and poor in others.
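    For readers unfamiliar with the hybrid scheme mentioned above, the following sketch shows the standard Patankar-style neighbour coefficient it yields in a finite-volume/SIMPLE discretisation (central differencing for cell-face Peclet numbers below 2, upwinding otherwise); the grid, boundary conditions, and compressibility treatment of the actual calculation are not reproduced:

        def hybrid_coefficient(F: float, D: float, east_face: bool) -> float:
            """Neighbour coefficient a_nb for the hybrid differencing scheme.

            F         : convective mass flux through the cell face (rho * u * A).
            D         : diffusive conductance of the face (Gamma * A / dx).
            east_face : True for the east neighbour, a_E = D*A(|Pe|) + max(-F, 0);
                        False for the west neighbour, a_W = D*A(|Pe|) + max(F, 0).
            """
            peclet = abs(F) / D if D != 0.0 else float("inf")
            a = D * max(0.0, 1.0 - 0.5 * peclet)   # A(|Pe|) for the hybrid scheme
            a += max(-F, 0.0) if east_face else max(F, 0.0)
            return a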

    FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs

    Training large language models (LLMs) is a costly endeavour in terms of time and computational resources. The large amount of training data used during the unsupervised pre-training phase makes it difficult to verify all data and, unfortunately, undesirable data may be ingested during training. Re-training from scratch is impractical and has led to the creation of the 'unlearning' discipline, where models are modified to "unlearn" undesirable information without retraining. However, any modification can alter the behaviour of LLMs, especially on key dimensions such as fairness. This is the first work that examines the interplay between unlearning and fairness for LLMs. In particular, we focus on a popular unlearning framework known as SISA [Bourtoule et al., 2021], which creates an ensemble of models trained on disjoint shards. We evaluate the performance-fairness trade-off for SISA and empirically demonstrate that SISA can indeed reduce fairness in LLMs. To remedy this, we propose post-processing bias mitigation techniques for ensemble models produced by SISA. We adapt the post-processing fairness improvement technique from [Hardt et al., 2016] to design three methods that can handle model ensembles, and prove that one of the methods is an optimal fair predictor for an ensemble of models. Through experimental results, we demonstrate the efficacy of our post-processing framework, called 'FairSISA'. Comment: Accepted at the NeurIPS 2023 Workshop on Socially Responsible Language Modelling Research (SoLaR)
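    A minimal sketch of the overall pipeline, assuming the SISA shard models expose positive-class scores and that a per-group decision threshold has already been chosen in the spirit of [Hardt et al., 2016]; the three ensemble-aware methods and the optimality proof from the paper are not reproduced:

        import numpy as np

        def sisa_ensemble_scores(shard_scores):
            """Average positive-class scores over the SISA shard models.
            shard_scores : (n_shards, n_examples) array of scores in [0, 1]."""
            return np.asarray(shard_scores, dtype=float).mean(axis=0)

        def group_thresholded_predictions(scores, groups, thresholds):
            """Post-process the ensembled scores with one decision threshold per
            protected group (a simplified, deterministic variant of the
            [Hardt et al., 2016] post-processing; `thresholds` is a hypothetical
            dict {group_value: threshold})."""
            scores = np.asarray(scores, dtype=float)
            return np.array([int(s >= thresholds[g]) for s, g in zip(scores, groups)])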