
    Dynamic Global Memory for Document-level Argument Extraction

    Extracting informative arguments of events from news articles is a challenging problem in information extraction, which requires a global contextual understanding of each document. While recent work on document-level extraction has gone beyond the single-sentence setting and increased the cross-sentence inference capability of end-to-end models, such models are still restricted by input sequence length constraints and usually ignore the global context between events. To tackle this issue, we introduce a new global neural generation-based framework for document-level event argument extraction that constructs a document memory store to record contextual event information and leverages it, both implicitly and explicitly, to help decode the arguments of later events. Empirical results show that our framework substantially outperforms prior methods, and our constrained decoding design makes it more robust to adversarially annotated examples. (Our code and resources are available at https://github.com/xinyadu/memory_docie for research purposes.) Comment: ACL 2022 main conference (12 pages)
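    The core idea of the memory store — decoding events in document order while letting each extraction consult the arguments recorded for earlier events — can be sketched as follows. All names here are illustrative assumptions, not the authors' actual API.

    ```python
    # Minimal sketch of a document-level memory store for event
    # argument extraction. `extract_args` stands in for any per-event
    # extractor that can condition on previously decoded events.

    def extract_with_memory(events, extract_args):
        """Decode events in document order; each extraction sees the
        memory of previously decoded (event, arguments) pairs."""
        memory = []   # grows as events are decoded
        results = []
        for event in events:
            args = extract_args(event, memory)
            memory.append((event, args))
            results.append(args)
        return results
    ```

    The memory grows monotonically, so later events in the document always see the full extraction history of earlier ones, which is what enables cross-event consistency during decoding.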

    Physics Informed Reinforcement Learning for Power Grid Control using Augmented Random Search

    Wide adoption of deep reinforcement learning in the energy system domain needs to overcome several challenges, including scalability, learning from limited samples, and high-dimensional continuous state and action spaces. In this paper, we integrate physics-based information from the generator operation state formula, also known as the swing equation, into the reinforcement learning agent's neural network loss function, and apply an augmented random search agent to optimize generator control under dynamic contingency. Simulation results demonstrate improvements in reliability, training speed, and reward convergence, as well as the approach's potential for transferability and scalability.
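    The general pattern of folding the swing equation into a loss function can be sketched as below. The inertia M, damping D, and weighting lambda are illustrative placeholders, not values from the paper.

    ```python
    # Hedged sketch of a physics-informed loss: the RL task loss is
    # augmented with the squared residual of the swing equation
    #   M * delta'' = P_m - P_e - D * delta'
    # so trajectories violating generator dynamics are penalized.

    def swing_residual(delta_ddot, delta_dot, p_m, p_e, M=1.0, D=0.1):
        # Residual is zero when the state obeys the swing equation.
        return M * delta_ddot - (p_m - p_e - D * delta_dot)

    def physics_informed_loss(task_loss, delta_ddot, delta_dot,
                              p_m, p_e, lam=0.5):
        r = swing_residual(delta_ddot, delta_dot, p_m, p_e)
        return task_loss + lam * r ** 2
    ```

    In the penalized case the agent trades off reward against physical consistency, which is what drives the reported gains in training speed and convergence.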

    FAITHSCORE: Evaluating Hallucinations in Large Vision-Language Models

    We introduce FAITHSCORE (Faithfulness to Atomic Image Facts Score), a reference-free and fine-grained evaluation metric that measures the faithfulness of free-form answers generated by large vision-language models (LVLMs). The FAITHSCORE evaluation first identifies sub-sentences containing descriptive statements that need to be verified, then extracts a comprehensive list of atomic facts from these sub-sentences, and finally conducts consistency verification between the fine-grained atomic facts and the input image. Meta-evaluation demonstrates that our metric correlates highly with human judgments of faithfulness. We collect two benchmark datasets (i.e., LLaVA-1k and MSCOCO-Cap) for evaluating instruction-following hallucinations in LVLMs, and measure hallucinations in state-of-the-art LVLMs with FAITHSCORE on these datasets. Results reveal that current systems are prone to generating hallucinated content unfaithful to the image, which leaves room for future improvement. Further, we find that current LVLMs, despite doing well on color and counting, still struggle with long answers, relations, and multiple objects.
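    The final scoring step of a pipeline like this can be sketched as the fraction of atomic facts that pass verification. The `verify` callable is an assumption standing in for the paper's image-consistency checker (a visual entailment model in practice).

    ```python
    # Illustrative sketch of atomic-fact scoring: given the atomic
    # facts extracted from an answer and a verifier against the image,
    # the score is the fraction of facts consistent with the image.

    def faithscore(facts, verify):
        if not facts:
            return 1.0  # no descriptive content, nothing to hallucinate
        verified = sum(1 for fact in facts if verify(fact))
        return verified / len(facts)
    ```

    A score of 1.0 means every atomic fact was grounded in the image; lower scores indicate hallucinated content.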

    Large Language Models for Automated Open-domain Scientific Hypotheses Discovery

    Hypothetical induction is recognized as the main reasoning type when scientists make observations about the world and try to propose hypotheses that explain those observations. Past research on hypothetical induction has used a limited setting in which (1) the observation annotations of the dataset are not raw web corpora but manually selected sentences (resulting in a closed-domain setting), and (2) the ground-truth hypothesis annotations are mostly commonsense knowledge, making the task less challenging. In this work, we propose the first NLP dataset for social science academic hypothesis discovery, consisting of 50 recent papers published in top social science journals. Raw web corpora necessary for developing the hypotheses in the published papers are also collected in the dataset, with the final goal of creating a system that automatically generates valid, novel, and helpful (to human researchers) hypotheses given only a pile of raw web corpora. The new dataset addresses the previous problems because it requires (1) using raw web corpora as observations, and (2) proposing hypotheses that are new even to humanity. A multi-module framework is developed for the task, along with three different feedback mechanisms that empirically show performance gains over the base framework. Finally, our framework exhibits high performance under both GPT-4-based evaluation and social science expert evaluation.
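    The propose-then-refine-with-feedback pattern described above can be sketched generically. The callables `propose`, `give_feedback`, and `refine` are assumptions for illustration; the paper's actual modules and its three feedback mechanisms differ in their details.

    ```python
    # Minimal sketch of iterative hypothesis refinement: an initial
    # hypothesis is proposed from the corpus, then repeatedly revised
    # in light of feedback.

    def discover_hypothesis(corpus, propose, give_feedback, refine,
                            rounds=3):
        hypothesis = propose(corpus)
        for _ in range(rounds):
            feedback = give_feedback(hypothesis, corpus)
            hypothesis = refine(hypothesis, feedback)
        return hypothesis
    ```

    The number of feedback rounds is a tunable assumption here; the empirical point in the abstract is that adding such feedback loops improves over the base framework.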

    A Bi-Objective Fuzzy Credibilistic Chance-Constrained Programming Approach for the Hazardous Materials Road-Rail Multimodal Routing Problem under Uncertainty and Sustainability

    Hazardous materials transportation involves extensive risk and cannot be avoided in practice. Advanced routing, however, can help to reduce the risk by planning transportation routes for hazardous materials that make effective tradeoffs between the risk objective and the economic objective. In this study, we explore the hazardous materials routing problem in a road-rail multimodal transportation network with a hub-and-spoke structure, in which risk is measured as the product of population exposure and the associated volume of hazardous materials, and minimizing the total risk over all transportation orders of hazardous materials is set as the risk objective. Population exposure is difficult to estimate exactly during the routing decision-making process, which makes it uncertain. We formulate the uncertain population exposure from a fuzzy programming perspective by using triangular fuzzy numbers. Moreover, a carbon dioxide emission constraint is formulated to realize sustainable transportation of hazardous materials. To optimize the problem under the above framework, we first establish a bi-objective fuzzy mixed integer nonlinear programming model, and then develop a three-stage exact solution strategy that combines the fuzzy credibilistic chance constraint, a linearization technique, and the normalized weighting method. Finally, a computational experiment is carried out to verify the feasibility of the proposed method. The experimental results indicate that tradeoffs between the two conflicting objectives can be effectively made by using the Pareto frontier of the hazardous materials routing problem. Furthermore, the credibility level and the carbon dioxide emission cap significantly influence the routing optimization. Their effects on the optimization result are quantified by sensitivity analysis, which yields useful insights to help decision makers better organize hazardous materials road-rail multimodal transportation under uncertainty and sustainability. Document type: Article
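    A building block of the credibilistic chance constraint mentioned above is the standard credibility measure from fuzzy credibility theory. As a sketch, for a triangular fuzzy number (a, b, c), the credibility that it does not exceed a threshold r is piecewise linear; the model's full chance constraints are built on top of this measure, with details beyond this sketch.

    ```python
    # Credibility that a triangular fuzzy number (a, b, c) is <= r,
    # per the standard credibility-theory formula: 0 below a, 1 above
    # c, piecewise linear in between, with credibility 0.5 at the
    # most-likely value b.

    def credibility_leq(a, b, c, r):
        if r <= a:
            return 0.0
        if r <= b:
            return (r - a) / (2 * (b - a))
        if r <= c:
            return (r - 2 * b + c) / (2 * (c - b))
        return 1.0
    ```

    A credibilistic chance constraint then requires this credibility to reach a decision-maker-chosen confidence level, which is the "credibility level" whose sensitivity the experiments analyze.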

    Language Models as Inductive Reasoners

    Inductive reasoning is a core component of human intelligence. In past research on inductive reasoning within computer science, formal language has been used as the representation of knowledge (facts and rules, more specifically). However, formal language causes systematic problems for inductive reasoning, such as the inability to handle raw input like natural language, sensitivity to mislabeled data, and incapacity to handle ambiguous input. To this end, we propose a new paradigm (task) for inductive reasoning: inducing natural language rules from natural language facts. We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language. New automatic metrics are also proposed and analysed for the evaluation of this task. With DEER, we investigate a modern approach to inductive reasoning in which we use natural language instead of formal language as the representation of knowledge, and use pretrained language models as ''reasoners''. Moreover, we provide the first comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts. We also propose a new framework for this task, drawing insights from the philosophy literature, which surpasses baselines in both automatic and human evaluations, as shown in the experiment section. We discuss our future perspectives for inductive reasoning in Section 7. Dataset and code are available at https://github.com/ZonglinY/Inductive_Reasoning. Comment: Accepted by EACL 202
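    The task format — natural language facts in, one natural language rule out, with a pretrained LM as the reasoner — can be sketched as a prompt-construction step. The `generate` callable is an assumption standing in for any LM call; it is not the paper's actual interface.

    ```python
    # Hedged sketch of rule induction over natural-language facts:
    # list the facts in a prompt and ask the model for one general rule.

    def build_induction_prompt(facts):
        listed = "\n".join(f"- {fact}" for fact in facts)
        return (
            "Given the following facts, induce one general rule "
            "in natural language:\n" + listed + "\nRule:"
        )

    def induce_rule(facts, generate):
        return generate(build_induction_prompt(facts))
    ```

    The contrast with formal-language induction is that both the inputs and the induced rule stay in natural language, so raw, noisy, or ambiguous facts need no symbolic encoding first.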