Automatic Extraction of Commonsense LocatedNear Knowledge
The LocatedNear relation is a kind of commonsense knowledge describing two
physical objects that are typically found near each other in real life. In this
paper, we study how to automatically extract such relationships by applying a
sentence-level relation classifier and aggregating the scores of entity pairs
over a large corpus. We also release two benchmark datasets for evaluation and
future research.
Comment: Accepted by ACL 2018. A preliminary version was presented at
AKBC@NIPS'1
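The aggregation step described in the abstract above can be illustrated with a minimal sketch. The abstract does not state which scoring function is used, so the thresholded sum below, the function name `aggregate_pair_scores`, and the sample scores are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict

def aggregate_pair_scores(sentence_scores, threshold=0.5):
    """Combine per-sentence classifier scores into one score per object pair.

    sentence_scores: iterable of ((obj_a, obj_b), probability) items, one per
    sentence in which both objects were mentioned (hypothetical input format).
    """
    totals = defaultdict(float)
    for (obj_a, obj_b), prob in sentence_scores:
        # Assumption: only sentences the classifier is fairly sure about count.
        if prob >= threshold:
            # Treat the relation as symmetric by sorting the pair.
            key = tuple(sorted((obj_a, obj_b)))
            totals[key] += prob
    # Rank pairs from highest to lowest aggregated score.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Made-up sentence-level scores, for illustration only.
scores = [
    (("fork", "plate"), 0.9),
    (("plate", "fork"), 0.8),
    (("fork", "cloud"), 0.2),
    (("bed", "pillow"), 0.7),
]
ranked = aggregate_pair_scores(scores)
```

Summing rewards pairs that co-occur often; averaging or counting confident sentences would be equally plausible choices under the same assumptions.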
Incorporating External Knowledge through Pre-training for Natural Language to Code Generation
Open-domain code generation aims to generate code in a general-purpose
programming language (such as Python) from natural language (NL) intents.
Motivated by the intuition that developers usually retrieve resources on the
web when writing code, we explore the effectiveness of incorporating two
varieties of external knowledge into NL-to-code generation: automatically mined
NL-code pairs from the online programming QA forum StackOverflow and
programming language API documentation. Our evaluations show that combining the
two sources with data augmentation and retrieval-based data re-sampling
improves the current state-of-the-art by up to 2.2% absolute BLEU score on the
code generation testbed CoNaLa. The code and resources are available at
https://github.com/neulab/external-knowledge-codegen.
Comment: Accepted by ACL 202
Hierarchical Prompting Assists Large Language Model on Web Navigation
Large language models (LLMs) struggle to process complicated observations
in interactive decision-making tasks. To alleviate this issue, we propose a
simple hierarchical prompting approach. Diverging from previous prompting
approaches that always put the full observation (e.g., a web page) into the
prompt, we propose to first construct an action-aware observation, which is more
condensed and relevant, with a dedicated SUMMARIZER prompt. The ACTOR prompt
then predicts the next action based on the summarized observation. While our
method has broad applicability, we particularly demonstrate its efficacy in the
complex domain of web navigation where a full observation often contains
redundant and irrelevant information. Our approach outperforms the previous
state-of-the-art prompting mechanics by 6.2% in task success rate,
demonstrating its potential for interactive decision-making tasks with long
observation traces.
Comment: EMNLP 2023 Findings; Natural Language Reasoning and Structured
Explanations Workshop at ACL 202
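The two-stage flow described in the abstract above (a SUMMARIZER prompt followed by an ACTOR prompt) can be sketched roughly as follows. The prompt wording, the function `hierarchical_step`, and the `stub_llm` stand-in are hypothetical; the abstract does not give the actual prompts or model interface.

```python
from typing import Callable, List

def hierarchical_step(
    llm: Callable[[str], str],
    instruction: str,
    observation: str,
    history: List[str],
) -> str:
    """Run one SUMMARIZER call, then one ACTOR call, and return the action."""
    # Stage 1: condense the full observation into an action-aware summary.
    summarizer_prompt = (
        "Instruction: " + instruction + "\n"
        "Previous actions: " + "; ".join(history) + "\n"
        "Observation:\n" + observation + "\n"
        "Summarize only the parts of the observation relevant to the next action."
    )
    summary = llm(summarizer_prompt)

    # Stage 2: the actor sees the summary, not the full observation.
    actor_prompt = (
        "Instruction: " + instruction + "\n"
        "Previous actions: " + "; ".join(history) + "\n"
        "Summarized observation:\n" + summary + "\n"
        "Next action:"
    )
    return llm(actor_prompt)

def stub_llm(prompt: str) -> str:
    """Canned responses standing in for a real model (assumption)."""
    if prompt.endswith("Next action:"):
        return "click[Buy Now]"
    return "Product page with a 'Buy Now' button."

action = hierarchical_step(
    stub_llm,
    instruction="buy a red mug",
    observation="<html>...long page with navigation, ads, footer...</html>",
    history=["search[red mug]"],
)
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client library.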
Elimination of the numerical Cerenkov instability for spectral EM-PIC codes
When using an electromagnetic particle-in-cell (EM-PIC) code to simulate a
relativistically drifting plasma, a violent numerical instability known as the
numerical Cerenkov instability (NCI) occurs. The NCI is due to the unphysical
coupling of electromagnetic waves on a grid to wave-particle resonances,
including resonances aliased in time and space, for a plasma drifting
relativistically along one coordinate direction. Recent studies have shown that an EM-PIC code which uses a
spectral field solver and a low pass filter can eliminate the fastest growing
modes of the NCI. Based on these studies a new spectral PIC code for studying
laser wakefield acceleration (LWFA) in the Lorentz boosted frame was developed.
However, we show that for parameters of relevance for LWFA simulations in the
boosted frame, a relativistically drifting plasma is susceptible to a host of
additional unstable modes with lower growth rates, and that these modes appear
when the fastest growing unstable modes are filtered out. We show that these
modes are most easily identified as the coupling between modes which are purely
transverse (EM) and purely longitudinal (Langmuir) in the rest frame of the
plasma for specific time and space aliases. We rewrite the dispersion relation
of the drifting plasma for a general field solver and obtain analytic
expressions for the location and growth rate of each unstable mode, i.e., for
each time- and space-aliased resonance. We show for the spectral solver that
when the fastest growing mode is eliminated, a new mode at the fundamental
resonance can be seen. (Please check the whole abstract in the
paper.)
Comment: 36 pages, 12 figure
HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories
GitHub has become an important platform for code sharing and scientific
exchange. With the massive number of repositories available, there is a
pressing need for topic-based search. Even though the topic label functionality
has been introduced, the majority of GitHub repositories do not have any
labels, impeding the utility of search and topic-based analysis. This work
targets the automatic repository classification problem as keyword-driven
hierarchical classification. Specifically, users only need to provide a label
hierarchy with keywords to supply as supervision. This setting is flexible,
adaptive to the users' needs, accounts for the different granularity of topic
labels and requires minimal human effort. We identify three key challenges of
this problem, namely (1) the presence of multi-modal signals; (2) supervision
scarcity and bias; (3) supervision format mismatch. In recognition of these
challenges, we propose the HiGitClass framework, comprising three modules:
heterogeneous information network embedding; keyword enrichment; topic modeling
and pseudo document generation. Experimental results on two GitHub repository
collections confirm that HiGitClass is superior to existing weakly-supervised
and dataless hierarchical classification methods, especially in its ability to
integrate both structured and unstructured data for repository classification.
Comment: 10 pages; Accepted to ICDM 2019; Some typos fixe