116 research outputs found

    Dynamic Learning of Sequential Choice Bandit Problem under Marketing Fatigue

    Full text link
    Motivated by the observation that overexposure to unwanted marketing activities leads to customer dissatisfaction, we consider a setting where a platform offers a sequence of messages to its users and is penalized when users abandon the platform due to marketing fatigue. We propose a novel sequential choice model to capture multiple interactions taking place between the platform and its user: Upon receiving a message, a user decides on one of the three actions: accept the message, skip and receive the next message, or abandon the platform. Based on user feedback, the platform dynamically learns users' abandonment distribution and their valuations of messages to determine the length of the sequence and the order of the messages, while maximizing the cumulative payoff over a horizon of length T. We refer to this online learning task as the sequential choice bandit problem. For the offline combinatorial optimization problem, we show that an efficient polynomial-time algorithm exists. For the online problem, we propose an algorithm that balances exploration and exploitation, and characterize its regret bound. Lastly, we demonstrate how to extend the model with user contexts to incorporate personalization

    Zero-Resource Hallucination Prevention for Large Language Models

    Full text link
    The prevalent use of large language models (LLMs) in various domains has drawn attention to the issue of "hallucination," which refers to instances where LLMs generate factually inaccurate or ungrounded information. Existing techniques for hallucination detection in language assistants rely on intricate fuzzy, specific free-language-based chain of thought (CoT) techniques or parameter-based methods that suffer from interpretability issues. Additionally, the methods that identify hallucinations post-generation could not prevent their occurrence and suffer from inconsistent performance due to the influence of the instruction format and model style. In this paper, we introduce a novel pre-detection self-evaluation technique, referred to as {\method}, which focuses on evaluating the model's familiarity with the concepts present in the input instruction and withholding the generation of response in case of unfamiliar concepts. This approach emulates the human ability to refrain from responding to unfamiliar topics, thus reducing hallucinations. We validate {\method} across four different large language models, demonstrating consistently superior performance compared to existing techniques. Our findings propose a significant shift towards preemptive strategies for hallucination mitigation in LLM assistants, promising improvements in reliability, applicability, and interpretability

    Fatigue-aware Bandits for Dependent Click Models

    Full text link
    As recommender systems send a massive amount of content to keep users engaged, users may experience fatigue which is contributed by 1) an overexposure to irrelevant content, 2) boredom from seeing too many similar recommendations. To address this problem, we consider an online learning setting where a platform learns a policy to recommend content that takes user fatigue into account. We propose an extension of the Dependent Click Model (DCM) to describe users' behavior. We stipulate that for each piece of content, its attractiveness to a user depends on its intrinsic relevance and a discount factor which measures how many similar contents have been shown. Users view the recommended content sequentially and click on the ones that they find attractive. Users may leave the platform at any time, and the probability of exiting is higher when they do not like the content. Based on user's feedback, the platform learns the relevance of the underlying content as well as the discounting effect due to content fatigue. We refer to this learning task as "fatigue-aware DCM Bandit" problem. We consider two learning scenarios depending on whether the discounting effect is known. For each scenario, we propose a learning algorithm which simultaneously explores and exploits, and characterize its regret bound

    A Benchmark Dataset for Understandable Medical Language Translation

    Full text link
    In this paper, we introduce MedLane -- a new human-annotated Medical Language translation dataset, to align professional medical sentences with layperson-understandable expressions. The dataset contains 12,801 training samples, 1,015 validation samples, and 1,016 testing samples. We then evaluate one naive and six deep learning-based approaches on the MedLane dataset, including directly copying, a statistical machine translation approach Moses, four neural machine translation approaches (i.e., the proposed PMBERT-MT model, Seq2Seq and its two variants), and a modified text summarization model PointerNet. To compare the results, we utilize eleven metrics, including three new measures specifically designed for this task. Finally, we discuss the limitations of MedLane and baselines, and point out possible research directions for this task

    Microwave-induced phase escape in a Josephson tunnel junction

    Get PDF
    This is the published version, also available here: http://dx.doi.org/10.1103/PhysRevB.77.104531.We perform both theoretical and experimental investigations on the phase escape of a current-biased Josephson tunnel junction under microwave irradiation. The switching current distributions exhibit abundant nonlinear behaviors depending on the power and frequency of the applied microwave. We present a model to describe the behavior of the primary peak in the switching current distribution, which is confirmed by our experimental results. The obtained features can be used to characterize the damping parameter of Josephson junctions
    • …
    corecore