
    ECHo: A Visio-Linguistic Dataset for Event Causality Inference via Human-Centric Reasoning

    We introduce ECHo (Event Causality Inference via Human-Centric Reasoning), a diagnostic dataset for event causality inference grounded in visio-linguistic social scenarios. ECHo builds on real-world, human-centric deductive information drawn from a television crime drama, and it requires Theory-of-Mind (ToM) ability to understand and reason about social interactions from multimodal information. Using ECHo, we propose a unified Chain-of-Thought (CoT) framework to assess the reasoning capability of current AI systems. Our ToM-enhanced CoT pipeline accommodates various large foundation models in both zero-shot and few-shot visio-linguistic reasoning, and we use it to scrutinize recent large foundation models such as InstructGPT and MiniGPT-4 on three diagnostic human-centric tasks. Further analysis demonstrates that ECHo is a challenging dataset that exposes imperfections and inconsistencies in reasoning. Our data and code are publicly available at https://github.com/YuxiXie/ECHo.
    Comment: Findings of EMNLP 2023. 10 pages, 6 figures, 5 tables (22 pages, 8 figures, 15 tables including references and appendices).
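    To make the two-step ToM-then-causality idea concrete, here is a minimal sketch of a ToM-enhanced CoT prompt for event causality inference. The template wording and the build_tom_cot_prompt helper are illustrative assumptions, not the authors' exact pipeline.

# A minimal sketch of a ToM-enhanced Chain-of-Thought prompt, assuming the
# scene, transcript, and target event are available as plain strings.
# The template wording and helper name are hypothetical, not the ECHo code.
TOM_COT_TEMPLATE = """Scene: {scene}
Transcript: {transcript}

Step 1 (Theory of Mind): describe what each character believes, intends,
and feels in this scene.
Step 2 (Causal inference): using those mental states, explain what caused
the event: {event}
Answer:"""


def build_tom_cot_prompt(scene: str, transcript: str, event: str) -> str:
    """Compose a two-step ToM-then-causality reasoning prompt."""
    return TOM_COT_TEMPLATE.format(scene=scene, transcript=transcript, event=event)


if __name__ == "__main__":
    prompt = build_tom_cot_prompt(
        scene="Two detectives interview a nervous witness.",
        transcript="Witness: 'I already told you everything I know.'",
        event="the witness abruptly ends the interview",
    )
    print(prompt)  # send this prompt to any instruction-following LLM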

    Automatic Model Selection with Large Language Models for Reasoning

    Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths. CoT employs natural language, offering flexibility and interpretability, while PAL uses programming language, yielding more structured and rigorous logic. We introduce a model selection method that combines the best of both worlds by employing a large language model (LLM) to dynamically select between them. Our theoretical analysis underscores the feasibility of this method, which is further corroborated by empirical results. Our method demonstrates significant performance improvements across eight reasoning datasets with Codex, ChatGPT, and GPT-4. It is also complementary to self-consistency; when integrated, it can further enhance performance while significantly reducing computation costs. Moreover, we achieve new state-of-the-art results on GSM8K and SVAMP, with respective accuracies of 96.8% and 93.7%. Our code, data, and prompts are available at https://github.com/XuZhao0/Model-Selection-Reasoning.
    Comment: Findings of EMNLP 2023.
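    The selection mechanism the abstract describes can be sketched in a few lines. Below, ask_llm is a hypothetical wrapper around any chat-style LLM, and the selection prompt is illustrative rather than the paper's exact one.

# A minimal sketch of LLM-based selection between a CoT answer and a
# PAL (program-aided) answer. `ask_llm`, `solve_cot`, and `solve_pal`
# are assumed helpers, not the authors' released code.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM API of choice")


def solve_cot(question: str) -> str:
    # CoT: natural-language step-by-step reasoning.
    return ask_llm(f"{question}\nLet's think step by step.")


def solve_pal(question: str) -> str:
    # PAL: have the model write Python, then execute it for the answer.
    code = ask_llm(f"Write a Python function solution() that answers:\n{question}")
    scope: dict = {}
    exec(code, scope)  # caution: execute untrusted model code in a sandbox
    return str(scope["solution"]())


def select_answer(question: str) -> str:
    cot, pal = solve_cot(question), solve_pal(question)
    if cot.strip() == pal.strip():
        return cot  # both methods agree; no selection needed
    choice = ask_llm(
        f"Question: {question}\n(A) {cot}\n(B) {pal}\n"
        "Which reasoning is correct? Answer A or B."
    )
    return cot if choice.strip().startswith("A") else pal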

    Productive Aging Conference Report


    Photocatalytic Removal of Organics over BiVO4-Based Photocatalysts

    Organic compounds, such as organic dyes and phenols, are the main pollutants in wastewater. In recent years, a large number of studies on the fabrication of BiVO4 and related materials and on their photocatalytic degradation of organics have been reported in the literature. In this chapter, we shall focus on advances in the synthesis and photocatalytic applications of several kinds of BiVO4-based photocatalysts: (i) well-defined morphological BiVO4 photocatalysts, (ii) porous BiVO4 photocatalysts, (iii) heteroatom-doped BiVO4 photocatalysts, (iv) BiVO4-based heterojunction photocatalysts, and (v) supported BiVO4 photocatalysts. We shall discuss the structure–photocatalytic performance relationships of these materials and the photocatalytic degradation mechanisms involved. In addition, we propose research trends and technologies for practical applications of BiVO4-based photocatalytic materials.

    Decomposition Enhances Reasoning via Self-Evaluation Guided Decoding

    We endow Large Language Models (LLMs) with fine-grained self-evaluation to refine multi-step reasoning inference. We propose an effective prompting approach that integrates self-evaluation guidance through stochastic beam search. Our approach explores the reasoning search space using a well-calibrated automatic criterion, enabling an efficient search that produces higher-quality final predictions. With self-evaluation guided stochastic beam search, we also balance the quality-diversity trade-off in the generation of reasoning chains. This allows our approach to work well with majority voting and to surpass the corresponding Codex-backboned baselines by 6.34%, 9.56%, and 5.46% in few-shot accuracy on the GSM8K, AQuA, and StrategyQA benchmarks, respectively. Analysis shows that our decompositional reasoning pinpoints logic failures and leads to higher consistency and robustness. Our code is publicly available at https://github.com/YuxiXie/SelfEval-Guided-Decoding.
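    The core loop of self-evaluation guided stochastic beam search can be sketched as follows. Here propose_steps and self_eval stand in for LLM calls: the first samples candidate next reasoning steps, the second scores a partial chain with a confidence in [0, 1]. Both helpers and the softmax-sampling details are assumptions for illustration, not the authors' implementation.

# A minimal sketch of self-evaluation guided stochastic beam search over
# reasoning steps; `propose_steps` and `self_eval` are assumed LLM wrappers.
import math
import random


def stochastic_beam_search(question, propose_steps, self_eval,
                           beam_width=4, max_steps=6, temperature=0.5):
    beams = [([], 0.0)]  # (partial chain of steps, cumulative log-score)
    for _ in range(max_steps):
        candidates = []
        for chain, score in beams:
            for step in propose_steps(question, chain):
                new_chain = chain + [step]
                conf = max(self_eval(question, new_chain), 1e-9)
                candidates.append((new_chain, score + math.log(conf)))
        # Stochastic selection: sample beams by a softmax over scores, which
        # trades answer quality against diversity of reasoning chains.
        weights = [math.exp(s / temperature) for _, s in candidates]
        beams = random.choices(candidates, weights=weights, k=beam_width)
    return max(beams, key=lambda b: b[1])[0]  # highest-scoring chain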

    InstructCoder: Empowering Language Models for Code Editing

    Code editing encompasses a variety of pragmatic tasks that developers deal with daily. Despite its relevance and practical usefulness, automatic code editing remains an underexplored area in the evolution of deep learning models, partly due to data scarcity. In this work, we explore the use of large language models (LLMs) to edit code based on user instructions, covering a broad range of implicit tasks such as comment insertion, code optimization, and code refactoring. To facilitate this, we introduce InstructCoder, the first dataset designed to adapt LLMs for general-purpose code editing, containing high-diversity code-editing tasks. It consists of over 114,000 instruction-input-output triplets and covers multiple distinct code-editing scenarios. The dataset is systematically expanded through an iterative process that starts with code-editing data sourced from GitHub commits as seed tasks; seed and generated tasks are then used to prompt ChatGPT for more task data. Our experiments demonstrate that open-source LLMs fine-tuned on InstructCoder can edit code correctly based on users' instructions most of the time, exhibiting unprecedented code-editing performance. These results suggest that proficient instruction fine-tuning can yield significant improvements in code-editing ability. The dataset and the source code are available at https://github.com/qishenghu/CodeInstruct
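    To illustrate the instruction-input-output triplet format the abstract describes, here is a small sketch of one code-editing example and how it might be rendered into an instruction-tuning prompt. The field names and prompt template are assumptions for illustration, not the dataset's documented schema.

# A minimal sketch of one instruction-input-output code-editing triplet
# and a renderer that turns it into a fine-tuning sample. Field names and
# the "### ..." template are hypothetical, not InstructCoder's exact format.
example = {
    "instruction": "Add a docstring to the function.",
    "input": "def add(a, b):\n    return a + b",
    "output": 'def add(a, b):\n    """Return the sum of a and b."""\n    return a + b',
}


def to_training_prompt(ex: dict) -> str:
    """Render one code-editing triplet as an instruction-tuning sample."""
    return (
        f"### Instruction:\n{ex['instruction']}\n\n"
        f"### Input:\n{ex['input']}\n\n"
        f"### Response:\n{ex['output']}"
    )


print(to_training_prompt(example))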