Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
From pre-trained language models (PLMs) to large language models (LLMs), the
field of natural language processing (NLP) has seen steep performance gains
and broad practical adoption. How a research field is evaluated guides its
direction of improvement, yet LLMs are extremely hard to evaluate thoroughly
for two reasons. First, traditional NLP tasks have become inadequate because
LLMs already perform so well on them. Second, existing evaluation tasks
struggle to keep pace with the wide range of real-world applications. To
tackle these problems, prior work has proposed a variety of benchmarks for
evaluating LLMs. To bring order to the many evaluation tasks in both academia
and industry, we survey a broad set of papers on LLM evaluation. We summarize
four core competencies of LLMs: reasoning, knowledge, reliability, and safety.
For each competency, we introduce its definition, corresponding benchmarks,
and metrics. Under this competency architecture, similar tasks are grouped to
reflect the corresponding ability, and new tasks can easily be added to the
system. Finally, we offer suggestions on future directions for LLM evaluation.
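As a quick illustration of how such a competency architecture can absorb new tasks, here is a minimal Python sketch; the registry design and the benchmark-to-competency assignments (GSM8K, MMLU, TruthfulQA, ToxiGen) are illustrative assumptions on my part, not taken from the survey.

```python
from collections import defaultdict

# Hypothetical registry for the four core competencies. Each competency
# groups related benchmarks with their metrics, and a new task is added
# with a single call, mirroring the extensibility described above.
competencies = defaultdict(list)

def register_task(competency: str, benchmark: str, metric: str) -> None:
    """Attach a benchmark and its metric to one core competency."""
    competencies[competency].append({"benchmark": benchmark, "metric": metric})

register_task("reasoning",   "GSM8K",      "accuracy")        # example assignments,
register_task("knowledge",   "MMLU",       "accuracy")        # not the survey's own
register_task("reliability", "TruthfulQA", "truthful rate")
register_task("safety",      "ToxiGen",    "toxicity rate")

# Adding a new task later does not change the architecture:
register_task("reasoning", "BIG-Bench Hard", "accuracy")
```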
Kanerva++: Extending the Kanerva Machine with Differentiable, Locally Block-Allocated Latent Memory
Episodic and semantic memory are critical components of the human memory
model. The theory of complementary learning systems (McClelland et al., 1995)
suggests that the compressed representation produced by a serial event
(episodic memory) is later restructured to build a more generalized form of
reusable knowledge (semantic memory). In this work we develop a new principled
Bayesian memory allocation scheme that bridges the gap between episodic and
semantic memory via a hierarchical latent variable model. We take inspiration
from traditional heap allocation and extend the idea of locally contiguous
memory to the Kanerva Machine, enabling a novel differentiable, block-allocated
latent memory. In contrast to the Kanerva Machine, we simplify memory writing
by treating it as a fully feed-forward, deterministic process, relying on the
stochasticity of the read-key distribution to disperse information within the
memory. We demonstrate that this allocation scheme improves performance in
memory-conditional image generation, yielding new state-of-the-art conditional
likelihoods on binarized MNIST (<=41.58 nats/image) and binarized Omniglot
(<=66.24 nats/image), along with competitive performance on CIFAR10, DMLab
Mazes, Celeb-A, and ImageNet32x32.
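To make the write/read asymmetry concrete, here is a small NumPy sketch of a block-allocated memory with a deterministic write and a stochastic read; the sizes, the softmax read, and the Gaussian key noise are simplifying assumptions of mine, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_slots, slot_dim, block = 64, 32, 8     # illustrative sizes
memory = np.zeros((n_slots, slot_dim))

def write_block(memory, episode, start):
    """Deterministic, feed-forward write into a locally contiguous block."""
    memory[start:start + block] = episode.reshape(block, slot_dim)
    return memory

def read(memory, key_mean, key_std=0.1):
    """Stochastic read: a noisy key plus softmax attention disperses information."""
    key = key_mean + key_std * rng.standard_normal(slot_dim)
    scores = memory @ key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ memory               # convex combination of slots

episode = rng.standard_normal(block * slot_dim)   # stand-in for an encoded event
memory = write_block(memory, episode, start=0)    # heap-style contiguous allocation
readout = read(memory, key_mean=memory[0])
```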
Deep Learning for Text Style Transfer: A Survey
Text style transfer is an important task in natural language generation,
which aims to control certain attributes in the generated text, such as
politeness, emotion, humor, and many others. It has a long history in the field
of natural language processing, and has recently regained significant
attention thanks to the promising performance of deep neural models. In
this paper, we present a systematic survey of the research on neural text style
transfer, spanning over 100 representative articles since the first neural text
style transfer work in 2017. We discuss the task formulation, existing datasets
and subtasks, evaluation, as well as the rich methodologies in the presence of
parallel and non-parallel data. We also provide discussions on a variety of
important topics regarding the future development of this task. Our curated
paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey
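To pin down the task formulation in code terms, a style-transfer system realizes a mapping from a sentence x and a target attribute a to a rewrite y that preserves content while changing the attribute; the toy rule below is my own placeholder to make that contract concrete, not a method from the surveyed literature.

```python
# Toy stand-in for a learned model p(y | x, a="polite"): keep the content,
# change the attribute. Real neural systems learn this mapping from data.
def transfer_politeness(x: str) -> str:
    return "Could you please " + x[0].lower() + x[1:].rstrip(".!") + "?"

print(transfer_politeness("Send me the report."))
# -> Could you please send me the report?
```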
SHOT-VAE: Semi-supervised Deep Generative Models With Label-aware ELBO Approximations
Semi-supervised variational autoencoders (VAEs) have obtained strong results,
but have also encountered the challenge that good ELBO values do not always
imply accurate inference results. In this paper, we investigate this problem
and identify two causes: (1) the ELBO objective cannot use the label
information directly; (2) a bottleneck value exists, beyond which continuing
to optimize the ELBO does not improve inference accuracy. Based on these
experimental results, we propose SHOT-VAE to address these problems without
introducing additional prior knowledge. The SHOT-VAE offers two contributions:
(1) a new ELBO approximation, smooth-ELBO, that integrates the label
predictive loss into the ELBO; and (2) an approximation based on optimal
interpolation that breaks the ELBO value bottleneck by reducing the gap
between the ELBO and the data likelihood. The SHOT-VAE achieves good
performance, with a 25.30% error
rate on CIFAR-100 with 10k labels and reduces the error rate to 6.11% on
CIFAR-10 with 4k labels.
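A rough sketch of the general idea behind a label-aware ELBO (my own simplification in PyTorch, not the paper's exact smooth-ELBO or interpolation scheme): on labeled batches, a predictive loss on the classifier logits is folded directly into the usual reconstruction-plus-KL objective.

```python
import torch
import torch.nn.functional as F

def label_aware_elbo(x, x_recon, mu, logvar, logits, y=None, alpha=1.0):
    """Negative ELBO with an optional label term; `alpha` is an assumed weight.

    x, x_recon : inputs in [0, 1] and their reconstructions
    mu, logvar : Gaussian posterior parameters of the latent code
    logits, y  : classifier outputs and (optional) integer labels
    """
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    loss = recon + kl
    if y is not None:  # labeled batch: let label information shape the objective
        loss = loss + alpha * F.cross_entropy(logits, y, reduction="sum")
    return loss
```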
Learning Programmatically Structured Representations with Perceptor Gradients
We present the perceptor gradients algorithm, a novel approach to learning
symbolic representations that decomposes an agent's policy into (i) a
perceptor network, which extracts symbols from raw observation data, and
(ii) a task-encoding program, which maps the input symbols to output actions. We
show that the proposed algorithm is able to learn representations that can be
directly fed into a Linear-Quadratic Regulator (LQR) or a general-purpose A*
planner. Our experimental results confirm that the perceptor gradients
algorithm is able to efficiently learn transferable symbolic representations as
well as generate new observations according to a semantically meaningful
specification.
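The decomposition itself is easy to state in code. In this minimal PyTorch sketch (sizes and the proportional-controller program are my illustrative stand-ins for the paper's LQR or A* components), gradients reach the perceptor through the program.

```python
import torch
import torch.nn as nn

class Perceptor(nn.Module):
    """Maps raw observations to (continuous estimates of) task symbols."""
    def __init__(self, obs_dim=16, n_symbols=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(),
                                 nn.Linear(32, n_symbols))

    def forward(self, obs):
        return self.net(obs)

def program(symbols, gain=0.5):
    """Task-encoding program: here, a hand-written proportional controller."""
    return -gain * symbols

obs = torch.randn(1, 16)
action = program(Perceptor()(obs))   # policy = program o perceptor
loss = action.pow(2).sum()           # any downstream objective...
loss.backward()                      # ...backpropagates into the perceptor
```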
Learning Likelihoods with Conditional Normalizing Flows
Normalizing Flows (NFs) are able to model complicated distributions p(y) with
strong inter-dimensional correlations and high multimodality by transforming a
simple base density p(z) through an invertible neural network under the change
of variables formula. Such behavior is desirable in multivariate structured
prediction tasks, where handcrafted per-pixel loss-based methods inadequately
capture strong correlations between output dimensions. We present a study of
conditional normalizing flows (CNFs), a class of NFs where the base density to
output space mapping is conditioned on an input x, to model conditional
densities p(y|x). CNFs are efficient in sampling and inference, can be
trained with a likelihood-based objective, and, being generative flows, do
not suffer from mode collapse or training instabilities. We provide an
effective method to train continuous CNFs for binary problems; in particular,
we apply these CNFs to super-resolution and vessel segmentation tasks,
demonstrating competitive performance on standard benchmark datasets in
terms of likelihood and conventional metrics.
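The conditional change-of-variables computation at the heart of a CNF can be sketched in a few lines; here a single affine layer, whose scale and shift would be predicted from the conditioning input x, stands in for a full flow (an illustrative simplification, not the paper's model).

```python
import math
import torch

def conditional_affine_logprob(y, scale, shift):
    """log p(y|x) = log p(z) + log|det dz/dy| for z = (y - shift) * scale,
    with a standard normal base density p(z)."""
    z = (y - shift) * scale
    log_base = -0.5 * (z ** 2).sum(-1) - 0.5 * y.shape[-1] * math.log(2 * math.pi)
    log_det = torch.log(scale.abs()).sum(-1)   # Jacobian of the affine map
    return log_base + log_det

y = torch.randn(8, 3)
scale = torch.rand(8, 3) + 0.5   # in practice: predicted by a network from x
shift = torch.randn(8, 3)
print(conditional_affine_logprob(y, scale, shift).shape)   # torch.Size([8])
```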