Distilling Word Embeddings: An Encoding Approach
Distilling knowledge from a well-trained cumbersome network to a small one
has recently become a new research topic, as lightweight, high-performance
neural networks are in high demand in various resource-restricted systems.
This paper addresses the problem of distilling word embeddings for NLP
tasks. We propose an encoding approach to distill task-specific knowledge from
a set of high-dimensional embeddings, which can reduce model complexity by a
large margin as well as retain high accuracy, showing a good compromise between
efficiency and performance. Experiments in two tasks reveal the phenomenon that
distilling knowledge from cumbersome embeddings is better than directly
training neural networks with small embeddings.
Comment: Accepted by CIKM-16 as a short paper, and by the Representation
Learning for Natural Language Processing (RL4NLP) Workshop @ACL-16 for
presentation.
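The encoding approach can be pictured as a learned projection from the cumbersome, high-dimensional embeddings to a compact space, trained on the downstream task. Below is a minimal PyTorch sketch, assuming a pooled-classification setup; the layer sizes, pooling, and class names are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of distilling task-specific knowledge from large embeddings
# via an encoding layer (illustrative setup, not the paper's exact model).
import torch
import torch.nn as nn

class DistilledEmbeddingModel(nn.Module):
    def __init__(self, big_embeddings: torch.Tensor, small_dim: int, n_classes: int):
        super().__init__()
        vocab_size, big_dim = big_embeddings.shape
        # Frozen high-dimensional ("cumbersome") embedding table.
        self.big = nn.Embedding.from_pretrained(big_embeddings, freeze=True)
        # Encoding layer: projects big embeddings down to the small dimension.
        self.encode = nn.Linear(big_dim, small_dim)
        # Lightweight task model on top of the distilled embeddings.
        self.classifier = nn.Linear(small_dim, n_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.encode(self.big(token_ids))   # (batch, seq, small_dim)
        x = torch.tanh(x).mean(dim=1)          # simple pooled sentence vector
        return self.classifier(x)

# After task-specific training, self.encode(self.big.weight) can be exported
# as a stand-alone small embedding table for a resource-restricted system.
```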
Determination of incommensurate modulated structure in Bi2Sr1.6La0.4CuO6+{\delta} by aberration-corrected transmission electron microscopy
Incommensurate modulated structure (IMS) in Bi2Sr1.6La0.4CuO6+{\delta}
(BSLCO) has been studied by aberration corrected transmission electron
microscopy in combination with high-dimensional (HD) space description. Two
images in the negative Cs imaging (NCSI) and passive Cs imaging (PCSI) modes
were deconvoluted, respectively. Similar results regarding the IMS were
obtained from the two corresponding projected potential maps (PPMs), but the
dots representing atoms in the NCSI PPM are found to be smaller than those in
the PCSI one. Since dot size is one of the factors influencing precision, the
modulation functions for all non-overlapping atoms in BSLCO were determined
based on the PPM obtained from the NCSI image in combination with the HD space
description.
Self-Edit: Fault-Aware Code Editor for Code Generation
Large language models (LLMs) have demonstrated an impressive ability to
generate code for competitive programming tasks. However, with a limited number
of samples, LLMs still suffer from poor accuracy. Inspired by the process of human
programming, we propose a generate-and-edit approach named Self-Edit that
utilizes execution results of the generated code from LLMs to improve the code
quality on the competitive programming task. We execute the generated code on
the example test case provided in the question and wrap execution results into
a supplementary comment. Utilizing this comment as guidance, our fault-aware
code editor is employed to correct errors in the generated code. We perform
extensive evaluations across two competitive programming datasets with nine
different LLMs. Compared to directly generating from LLMs, our approach can
improve the average pass@1 by 89% on APPS-dev, 31% on APPS-test, and 48%
on HumanEval over nine popular code generation LLMs with parameter sizes
ranging from 110M to 175B. Compared to other post-processing methods, our
method demonstrates superior accuracy and efficiency.
Comment: Accepted by ACL 2023.
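The generate-and-edit loop described above amounts to: run the draft program on the example test case, summarize the outcome as a supplementary comment, and hand both to an editing pass. Below is a minimal Python sketch; the helper names generate_code and edit_code stand in for the LLM calls, and the comment format is an illustrative assumption, not the authors' exact interface.

```python
# Minimal sketch of a generate-and-edit loop with execution feedback.
import subprocess
import sys

def run_on_example(code: str, test_input: str, timeout: float = 5.0) -> str:
    """Execute the candidate program on one example test case and summarize the outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            input=test_input, capture_output=True, text=True, timeout=timeout,
        )
        if proc.returncode != 0:
            return f"Runtime error:\n{proc.stderr.strip()}"
        return f"Program output:\n{proc.stdout.strip()}"
    except subprocess.TimeoutExpired:
        return "Execution timed out."

def self_edit(problem: str, test_input: str, expected_output: str,
              generate_code, edit_code) -> str:
    draft = generate_code(problem)                 # first-pass LLM generation
    result = run_on_example(draft, test_input)
    # Wrap the execution result as a supplementary comment guiding the editor.
    feedback = (f"# Example input: {test_input!r}\n"
                f"# Expected output: {expected_output!r}\n"
                f"# {result}")
    return edit_code(problem, draft, feedback)     # fault-aware editing pass
```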
Towards Enhancing In-Context Learning for Code Generation
In-context learning (ICL) with pre-trained language models (PTLMs) has shown
great success in code generation. ICL does not require training: PTLMs take as
input a prompt consisting of a few requirement-code examples and a new
requirement, and output a new program. However, existing studies simply reuse
ICL techniques for natural language generation and ignore unique features of
code generation. We refer to these studies as standard ICL.
Inspired by observations of the human coding process, we propose a novel ICL
approach for code generation named AceCoder. Compared to standard ICL, AceCoder
has two novelties. (1) Example retrieval. It retrieves similar programs as
examples and learns programming skills (e.g., algorithms, APIs) from them. (2)
Guided Code Generation. It encourages PTLMs to output an intermediate
preliminary (e.g., test cases, APIs) before generating programs. The
preliminary can help PTLMs understand requirements and guide the next code
generation. We apply AceCoder to six PTLMs (e.g., Codex) and evaluate it on
three public benchmarks using Pass@k. Results show that AceCoder can
significantly improve the performance of PTLMs on code generation. (1) In terms
of Pass@1, AceCoder outperforms standard ICL by up to 79.7% and fine-tuned
models by up to 171%. (2) AceCoder is effective in PTLMs with different sizes
(e.g., 1B to 175B) and different languages (e.g., Python, Java, and
JavaScript). (3) We investigate multiple choices of the intermediate
preliminary. (4) We manually evaluate generated programs in three aspects and
prove the superiority of AceCoder. (5) Finally, we discuss some insights about
ICL for practitioners.
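The two novelties (example retrieval and guided code generation) can be combined into a single prompt-building step. Below is a minimal Python sketch; the naive string-similarity retriever, the test-cases-first prompt layout, and the complete_fn interface are illustrative stand-ins, not the paper's actual retriever or preliminary.

```python
# Minimal sketch: retrieve similar requirement-code pairs as in-context
# examples, then ask the model for a preliminary (test cases) before the code.
from difflib import SequenceMatcher

def retrieve_examples(requirement, corpus, k=2):
    """corpus: list of (requirement, code) pairs; naive similarity stands in for a real retriever."""
    scored = sorted(
        corpus,
        key=lambda pair: SequenceMatcher(None, requirement, pair[0]).ratio(),
        reverse=True,
    )
    return scored[:k]

def acecoder_prompt(requirement, corpus):
    parts = []
    for ex_req, ex_code in retrieve_examples(requirement, corpus):
        parts.append(f"### Requirement:\n{ex_req}\n### Solution:\n{ex_code}\n")
    parts.append(f"### Requirement:\n{requirement}\n"
                 "### First write a few test cases, then the program:\n")
    return "\n".join(parts)

# Usage: prompt = acecoder_prompt(new_requirement, example_corpus)
#        program = complete_fn(prompt)   # any PTLM completion function
```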
Improving Code Generation by Dynamic Temperature Sampling
Recently, Large Language Models (LLMs) have shown impressive results in code
generation. However, existing decoding strategies are designed for Natural
Language (NL) generation, overlooking the differences between NL and
programming languages (PL). Due to this oversight, a better decoding strategy
for code generation remains an open question. In this paper, we conduct the
first systematic study to explore a decoding strategy specialized in code
generation. With an analysis of loss distributions of code tokens, we find that
code tokens can be divided into two categories: challenging tokens that are
difficult to predict and confident tokens that can be easily inferred. Among
them, the challenging tokens mainly appear at the beginning of a code block.
Inspired by the above findings, we propose a simple yet effective method:
Adaptive Temperature (AdapT) sampling, which dynamically adjusts the
temperature coefficient when decoding different tokens. We apply a larger
temperature when sampling for challenging tokens, allowing LLMs to explore
diverse choices. We employ a smaller temperature for confident tokens, avoiding
the influence of tail randomness noise. We apply AdapT sampling to LLMs with
different sizes and conduct evaluations on two popular datasets. Results show
that AdapT sampling significantly outperforms state-of-the-art decoding
strategies.
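The sampling rule amounts to switching the temperature per decoding step depending on whether the next token looks challenging. Below is a minimal sketch against a Hugging Face-style causal LM; the newline heuristic for spotting challenging tokens and the two temperature values are illustrative assumptions rather than the paper's settings.

```python
# Minimal sketch of adaptive-temperature sampling (illustrative heuristic).
import torch

def adapt_sample(model, tokenizer, prompt, max_new_tokens=128,
                 t_high=1.0, t_low=0.2, device="cpu"):
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    for _ in range(max_new_tokens):
        logits = model(ids).logits[:, -1, :]          # next-token logits
        last_text = tokenizer.decode(ids[0, -1:])
        # Heuristic: tokens right after a newline start a new statement/block
        # and are treated as "challenging"; the rest as "confident".
        temperature = t_high if "\n" in last_text else t_low
        probs = torch.softmax(logits / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```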