
    Automatic Neural Question Generation using Community-based Question Answering Systems

    In this thesis, we address the problem of opinion question generation. The motivation behind this task is to provide users with more question samples related to their query when using search engines. In our view, Community Question Answering (CQA) forums, where one user posts a question and others answer it, are among the datasets closest to people's informal, casual speech. Specifically, we perform experiments on the Amazon question/answer dataset. Unlike conventional approaches that tackle question generation with hand-crafted rules, our approach is entirely data-driven. We model the problem as sequence-to-sequence learning with an encoder-decoder architecture, which has driven significant improvements across natural language processing in recent years. Our model benefits from an attention mechanism, which helps it focus on specific parts of the input sentence; a sketch of this architecture follows below. Furthermore, we provide solutions to two common failure modes: word repetition and generation of out-of-vocabulary tokens. We provide a detailed analysis of the system's performance. Experimental results show an improvement in automatic evaluation metrics such as the BLEU score over the state-of-the-art question generation system.
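
    A minimal sketch of the encoder-decoder-with-attention idea the abstract describes, assuming PyTorch. All names, dimensions, and the dot-product attention form are illustrative assumptions, not the thesis implementation; the repetition and out-of-vocabulary fixes (e.g. coverage and copy mechanisms) are omitted.

        # Sketch: seq2seq question generation with dot-product attention.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class Seq2SeqAttention(nn.Module):
            def __init__(self, vocab_size=10000, emb_dim=128, hid_dim=256):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
                # Decoder input = previous-token embedding + attention context.
                self.decoder = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
                self.out = nn.Linear(hid_dim, vocab_size)

            def forward(self, src, tgt):
                enc_out, hidden = self.encoder(self.embed(src))   # (B, S, H)
                logits = []
                for t in range(tgt.size(1)):                      # teacher forcing
                    tok = self.embed(tgt[:, t:t + 1])             # (B, 1, E)
                    # Attend over encoder states with the current decoder state.
                    scores = torch.bmm(enc_out, hidden[-1].unsqueeze(2))  # (B, S, 1)
                    weights = F.softmax(scores, dim=1)
                    context = (weights * enc_out).sum(dim=1, keepdim=True)  # (B, 1, H)
                    dec_out, hidden = self.decoder(
                        torch.cat([tok, context], dim=2), hidden)
                    logits.append(self.out(dec_out))
                return torch.cat(logits, dim=1)                   # (B, T, V)

        # Usage: answer token ids in, question logits out.
        model = Seq2SeqAttention()
        src = torch.randint(0, 10000, (4, 20))   # batch of answers
        tgt = torch.randint(0, 10000, (4, 12))   # batch of questions
        logits = model(src, tgt)                 # (4, 12, 10000)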

    PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers

    Powerful large language models (LLMs) have facilitated the development of writing assistants that promise to significantly improve the quality and efficiency of composition and communication. However, a barrier to effective assistance is the lack of personalization in LLM outputs to the author's communication style and specialized knowledge. In this paper, we address this challenge by proposing PEARL, a retrieval-augmented LLM writing assistant personalized with a generation-calibrated retriever. Our retriever is trained to select the historic user-authored documents for prompt augmentation that are likely to best personalize LLM generations for a user request. We propose two key novelties for training our retriever: 1) a training-data selection method that identifies user requests likely to benefit from personalization, and documents that provide that benefit; and 2) a scale-calibrating KL-divergence objective that ensures our retriever's scores closely track the benefit of a document for personalized generation (a sketch of such an objective follows below). We demonstrate the effectiveness of PEARL in generating personalized workplace social media posts and Reddit comments. Finally, we show that a generation-calibrated retriever can double as a performance predictor and further improve low-quality generations via LLM chaining.

    Comment: Pre-print, work in progress
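
    A sketch of a KL-divergence calibration objective in the spirit of what the abstract describes, assuming PyTorch: the retriever's score distribution over a user's candidate documents is pushed toward a distribution derived from each document's measured benefit to personalized generation. The benefit scores, temperatures, and function names here are illustrative assumptions, not PEARL's actual formulation.

        # Sketch: calibrating retriever scores to per-document generation benefit.
        import torch
        import torch.nn.functional as F

        def retriever_kl_loss(retriever_scores, generation_benefit,
                              tau_r=1.0, tau_g=0.5):
            """retriever_scores: (B, K) similarities of K candidate docs to a request.
            generation_benefit: (B, K) measured gain in generation quality when each
            doc augments the prompt (higher = more helpful). Returns scalar loss."""
            log_p_retriever = F.log_softmax(retriever_scores / tau_r, dim=-1)
            p_target = F.softmax(generation_benefit / tau_g, dim=-1)
            # KL(target || retriever): the retriever learns to match both the
            # ranking and the scale of the benefit signal, so its scores can
            # also serve as a performance predictor at inference time.
            return F.kl_div(log_p_retriever, p_target, reduction="batchmean")

        # Usage with random stand-in tensors.
        scores = torch.randn(8, 16, requires_grad=True)
        benefit = torch.rand(8, 16)
        loss = retriever_kl_loss(scores, benefit)
        loss.backward()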