Dynamic Global Memory for Document-level Argument Extraction
Extracting informative arguments of events from news articles is a
challenging problem in information extraction, which requires a global
contextual understanding of each document. While recent work on document-level
extraction has gone beyond single sentences and increased the cross-sentence
inference capability of end-to-end models, these models are still restricted by certain
input sequence length constraints and usually ignore the global context between
events. To tackle this issue, we introduce a new global neural generation-based
framework for document-level event argument extraction by constructing a
document memory store to record the contextual event information and leveraging
it to implicitly and explicitly help with decoding of arguments for later
events. Empirical results show that our framework outperforms prior methods
substantially and it is more robust to adversarially annotated examples with
our constrained decoding design. (Our code and resources are available at
https://github.com/xinyadu/memory_docie for research purposes.)
Comment: ACL 2022 main conference (12 pages
Physics Informed Reinforcement Learning for Power Grid Control using Augmented Random Search
Wide adoption of deep reinforcement learning in the energy system domain requires overcoming several challenges, including scalability, learning from limited samples, and high-dimensional continuous state and action spaces. In this paper, we integrated physics-based information from the generator operation state formula, also known as the Swing Equation, into the loss function of the reinforcement learning agent's neural network, and applied an augmented random search agent to optimize generator control under dynamic contingency. Simulation results demonstrated reliability performance improvements in training speed and reward convergence, as well as future potential in transferability and scalability.
FAITHSCORE: Evaluating Hallucinations in Large Vision-Language Models
We introduce FAITHSCORE (Faithfulness to Atomic Image Facts Score), a
reference-free and fine-grained evaluation metric that measures the
faithfulness of the generated free-form answers from large vision-language
models (LVLMs). The FAITHSCORE evaluation first identifies sub-sentences
containing descriptive statements that need to be verified, then extracts a
comprehensive list of atomic facts from these sub-sentences, and finally
conducts consistency verification between fine-grained atomic facts and the
input image. Meta-evaluation demonstrates that our metric highly correlates
with human judgments of faithfulness. We collect two benchmark datasets (i.e.,
LLaVA-1k and MSCOCO-Cap) for evaluating LVLMs' instruction-following
hallucinations. We measure hallucinations in state-of-the-art LVLMs with
FAITHSCORE on the datasets. Results reveal that current systems are prone to
generating hallucinated content unfaithful to the image, which leaves room for
future improvements. Further, we find that current LVLMs, despite doing well on
color and counting, still struggle with long answers, relations, and multiple
objects.
Large Language Models for Automated Open-domain Scientific Hypotheses Discovery
Hypothetical induction is recognized as the main reasoning type when
scientists make observations about the world and try to propose hypotheses to
explain those observations. Past research on hypothetical induction is
limited in that (1) the observation annotations of the dataset are not raw
web corpora but manually selected sentences (resulting in a closed-domain
setting); and (2) the ground-truth hypothesis annotations are mostly
commonsense knowledge, making the task less challenging. In this work, we
propose the first NLP dataset for social science academic hypotheses discovery,
consisting of 50 recent papers published in top social science journals. Raw
web corpora that are necessary for developing hypotheses in the published
papers are also collected in the dataset, with the final goal of creating a
system that automatically generates valid, novel, and helpful (to human
researchers) hypotheses, given only a pile of raw web corpora. The new dataset
addresses the previous problems because it requires (1) using raw web corpora
as observations; and (2) proposing hypotheses that are new even to humanity. A
multi-module framework is developed for the task, as well as three different
feedback mechanisms that empirically show performance gains over the base
framework. Finally, our framework exhibits high performance in terms of both
GPT-4-based evaluation and social science expert evaluation.
A Bi-Objective Fuzzy Credibilistic Chance-Constrained Programming Approach for the Hazardous Materials Road-Rail Multimodal Routing Problem under Uncertainty and Sustainability
Hazardous materials transportation involves extensive risk and cannot be avoided in practice. Advanced routing, however, can help reduce the risk by planning the best transportation routes for hazardous materials, making effective tradeoffs between the risk objective and the economic objective. In this study, we explore the hazardous materials routing problem in a road-rail multimodal transportation network with a hub-and-spoke structure, in which the risk is measured as the product of population exposure and the associated volume of hazardous materials, and minimizing the total risk of all hazardous materials transportation orders is set as the risk objective. It is difficult to estimate the population exposure exactly during the routing decision-making process, which makes it uncertain. We formulate the uncertain population exposure from a fuzzy programming perspective by using triangular fuzzy numbers. Moreover, a carbon dioxide emission constraint is formulated to realize the sustainable transportation of hazardous materials. To optimize the problem under the above framework, we first establish a bi-objective fuzzy mixed integer nonlinear programming model, and then develop a three-stage exact solution strategy that combines the fuzzy credibilistic chance constraint, a linearization technique, and the normalized weighting method. Finally, a computational experiment is carried out to verify the feasibility of the proposed method in dealing with the problem. The experimental results indicate that tradeoffs between the two conflicting objectives can be effectively made by using the Pareto frontier of the hazardous materials routing problem. Furthermore, the credibility level and the carbon dioxide emission cap significantly influence the hazardous materials routing optimization. Their effects on the optimization result are quantified by sensitivity analysis, which yields useful insights to help decision makers better organize hazardous materials road-rail multimodal transportation under uncertainty and sustainability.
Document type: Articl
Language Models as Inductive Reasoners
Inductive reasoning is a core component of human intelligence. In past
research on inductive reasoning within computer science, formal language has
been used to represent knowledge (facts and rules, more specifically).
However, formal language can cause systematic problems for inductive reasoning,
such as the inability to handle raw input like natural language,
sensitivity to mislabeled data, and the incapacity to handle ambiguous input. To
this end, we propose a new paradigm (task) for inductive reasoning, which is to
induce natural language rules from natural language facts, and create a dataset
termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts
are written in natural language. New automatic metrics are also proposed and
analysed for the evaluation of this task. With DEER, we investigate a modern
approach to inductive reasoning where we use natural language as the
representation of knowledge instead of formal language and use pretrained
language models as "reasoners". Moreover, we provide the first
comprehensive analysis of how well pretrained language models can induce
natural language rules from natural language facts. We also propose a new
framework drawing insights from the philosophy literature for this task, which,
as we show in the experiment section, surpasses baselines in both automatic and
human evaluations. We discuss our future perspectives for inductive
reasoning in Section 7. Dataset and code are available at
https://github.com/ZonglinY/Inductive_Reasoning.
Comment: Accepted by EACL 202