Active Learning for Natural Language Generation
The field of text generation suffers from a severe shortage of labeled data
due to the extremely expensive and time-consuming process involved in manual
annotation. A natural approach for coping with this problem is active learning
(AL), a well-known machine learning technique for improving annotation
efficiency by selectively choosing the most informative examples to label.
However, while AL has been well-researched in the context of text
classification, its application to text generation has remained largely unexplored.
In this paper, we present a first systematic study of active learning for text
generation, considering a diverse set of tasks and multiple leading AL
strategies. Our results indicate that existing AL strategies, despite their
success in classification, are largely ineffective for the text generation
scenario, and fail to consistently surpass the baseline of random example
selection. We highlight some notable differences between the classification and
generation scenarios, and analyze the selection behaviors of existing AL
strategies. Our findings motivate exploring novel approaches for applying AL to
NLG tasks.
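The pool-based selection loop the study evaluates can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the per-example scores here are a stand-in for whatever informativeness measure (e.g., model uncertainty over generated tokens) a real AL strategy would compute, and the random baseline is the one the paper compares against.

```python
import random

def random_selection(pool, k, seed=0):
    # the baseline: pick k examples uniformly at random
    rng = random.Random(seed)
    return rng.sample(sorted(pool), k)

def uncertainty_selection(pool, score, k):
    # an AL strategy: pick the k examples with the highest informativeness score
    return sorted(pool, key=score, reverse=True)[:k]

# toy pool of example ids with a mock per-example uncertainty score
pool = list(range(10))
scores = {i: (i * 37 % 10) / 10 for i in pool}
chosen = uncertainty_selection(pool, scores.get, 3)  # ids with the 3 highest scores
```

In a real loop, the chosen examples would be sent for manual annotation, added to the labeled set, and the model retrained before the next selection round.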
Efficient Benchmarking (of Language Models)
The increasing versatility of language models (LMs) has given rise to a new
class of benchmarks that comprehensively assess a broad range of capabilities.
Such benchmarks are associated with massive computational costs, reaching
thousands of GPU hours per model. However, the efficiency aspect of these
evaluation efforts has raised little discussion in the literature. In this work
we present the problem of Efficient Benchmarking, namely intelligently reducing
the computation costs of LM evaluation without compromising reliability. Using
the HELM benchmark as a test case, we investigate how different benchmark design
choices affect the computation-reliability tradeoff. We propose to evaluate the
reliability of such decisions by using a new measure, Decision Impact on
Reliability (DIoR for short). We find, for example, that the current leader on HELM
may change merely by removing a low-ranked model from the benchmark, and observe
that a handful of examples suffice to obtain the correct benchmark ranking.
Conversely, a slightly different choice of HELM scenarios varies the ranking widely.
Based on our findings, we outline a set of concrete recommendations for more
efficient benchmark design and utilization practices, leading to dramatic cost
savings with minimal loss of benchmark reliability, often reducing computation
by 100x or more.
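The reliability question behind such design decisions can be illustrated (DIoR itself is defined in the paper, not reproduced here) by checking how well a cheap, subsampled evaluation preserves the model ranking of the full benchmark, e.g., via Kendall's tau rank correlation. The scores below are made-up toy numbers:

```python
from itertools import combinations

def kendall_tau(a, b):
    # rank correlation between two score lists over the same set of models:
    # +1 means identical ordering, -1 means fully reversed
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

full = [0.81, 0.74, 0.66, 0.52]  # mean scores on the full example set
sub = [0.80, 0.75, 0.64, 0.50]   # mean scores on a small subsample
tau = kendall_tau(full, sub)     # 1.0: the subsample preserves the ranking
```

If tau stays near 1 as the subsample shrinks, the cheaper evaluation is a reliable proxy for the full one; this is the computation-reliability tradeoff the abstract refers to.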
Multi-Choice Minority Game
The generalization of the problem of adaptive competition, known as the
minority game, to the case of multiple possible choices for each player is
addressed, and applied to a system of interacting perceptrons with input and
output units of the Potts-spin type. An optimal solution of this
minority game, as well as the dynamic evolution of the adaptive strategies of
the players, are solved analytically for a general number of choices and compared with
numerical simulations.
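The multi-choice setting can be made concrete with a toy simulation. This is a hedged sketch of the game's payoff rule only, not the perceptron-based strategy dynamics the paper analyzes: players pick one of several rooms, and those in the least-crowded room win the round.

```python
import random

def play_round(n_players, n_choices, rng):
    # each player picks one of n_choices rooms; here picks are uniformly
    # random, standing in for the adaptive strategies of the paper
    picks = [rng.randrange(n_choices) for _ in range(n_players)]
    counts = [picks.count(c) for c in range(n_choices)]
    # the minority room (least crowded) wins this round
    minority = min(range(n_choices), key=counts.__getitem__)
    winners = sum(1 for p in picks if p == minority)
    return winners, counts

rng = random.Random(42)
winners, counts = play_round(101, 3, rng)  # 101 players, 3 possible choices
```

An adaptive strategy would use the history of winning rooms to bias future picks; the classical two-choice minority game is recovered by setting the number of choices to 2.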
Low autocorrelated multi-phase sequences
The interplay between the ground state energy of the Bernasconi model
generalized to multi-phase sequences, and the minimal value of the maximal autocorrelation
function, C_max, is examined analytically, and
the main results are: (a) The minimal value of C_max is
significantly smaller than the typical value for random
sequences. (b) The minimum of C_max over all sequences
of length N is obtained at an energy which is about 30% above the ground-state
energy of the generalized Bernasconi model, independent of the number of phases
m. (c) The maximal merit factor grows linearly with m. (d) For a
given N, the minimal C_max decreases with m, indicating that for m=N
a Barker code exists. The analytical results are
confirmed by simulations.
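For concreteness, in the multi-phase setting a sequence has unit-magnitude complex entries drawn from m phases, and quality is judged by the aperiodic autocorrelations C_k and the merit factor F = N^2 / (2 * sum_k |C_k|^2). A small sketch (the example sequence is an arbitrary illustration, not one from the paper):

```python
import cmath

def autocorrelations(seq):
    # aperiodic autocorrelations C_k for shifts k = 1 .. N-1
    n = len(seq)
    return [sum(seq[i] * seq[i + k].conjugate() for i in range(n - k))
            for k in range(1, n)]

def merit_factor(seq):
    # F = N^2 / (2 * sum over k of |C_k|^2), the generalized Bernasconi measure
    n = len(seq)
    energy = sum(abs(c) ** 2 for c in autocorrelations(seq))
    return n * n / (2 * energy)

# a 4-phase (m = 4) sequence: entries exp(2*pi*i*s/m) for phase indices s
m = 4
phase_indices = [0, 0, 1, 3]
seq = [cmath.exp(2j * cmath.pi * s / m) for s in phase_indices]
f = merit_factor(seq)  # 4.0 for this particular sequence
```

A (generalized) Barker code is a sequence with |C_k| <= 1 for all nonzero shifts, which is what result (d) says becomes attainable when the number of phases reaches the sequence length.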
The dynamics of proving uncolourability of large random graphs I. Symmetric Colouring Heuristic
We study the dynamics of a backtracking procedure capable of proving
uncolourability of graphs, and calculate its average running time T for sparse
random graphs, as a function of the average degree c and the number of vertices
N. The analysis is carried out by mapping the history of the search process
onto an out-of-equilibrium (multi-dimensional) surface growth problem. The
growth exponent of the average running time is quantitatively predicted, in
agreement with simulations.
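The kind of procedure being analyzed can be sketched as plain backtracking search (this is a minimal illustration; the paper's specific symmetric colouring heuristic is not reproduced here). The search proves uncolourability by exhausting the tree without finding a proper colouring:

```python
def colourable(adj, n, q, colours=None, v=0):
    # backtracking search: True iff the n-vertex graph with adjacency dict
    # adj admits a proper q-colouring; exhausting the tree proves it does not
    if colours is None:
        colours = [None] * n
    if v == n:
        return True  # every vertex coloured consistently
    for c in range(q):
        # try colour c if no already-coloured neighbour uses it
        if all(colours[u] != c for u in adj.get(v, ()) if u < v):
            colours[v] = c
            if colourable(adj, n, q, colours, v + 1):
                return True
    colours[v] = None  # backtrack
    return False

# the complete graph K4 needs 4 colours: 3-colouring attempts all fail
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
```

The running time T studied in the abstract is essentially the number of nodes such a search tree visits before all branches are refuted.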
Corpus Wide Argument Mining -- a Working Solution
One of the main tasks in argument mining is the retrieval of argumentative
content pertaining to a given topic. Most previous work addressed this task by
retrieving a relatively small number of relevant documents as the initial
source for such content. This line of research yielded moderate success, which
is of limited use in a real-world system. Furthermore, for such a system to
yield a comprehensive set of relevant arguments, over a wide range of topics,
it requires leveraging a large and diverse corpus in an appropriate manner.
Here we present a first end-to-end high-precision, corpus-wide argument mining
system. This is made possible by combining sentence-level queries over an
appropriate indexing of a very large corpus of newspaper articles, with an
iterative annotation scheme. This scheme addresses the inherent label bias in
the data and pinpoints the regions of the sample space whose manual labeling is
required to obtain high precision among top-ranked candidates.
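The sentence-level querying idea can be illustrated with a toy inverted index. This is a hypothetical sketch only: the term-overlap ranking below is a placeholder for the paper's actual query and ranking pipeline, and the corpus is invented.

```python
def build_index(sentences):
    # inverted index: map each term to the ids of sentences containing it
    index = {}
    for sid, sent in enumerate(sentences):
        for term in set(sent.lower().split()):
            index.setdefault(term, set()).add(sid)
    return index

def query(index, sentences, terms):
    # rank candidate sentences by how many query terms they contain
    hits = {}
    for t in terms:
        for sid in index.get(t.lower(), ()):
            hits[sid] = hits.get(sid, 0) + 1
    ranked = sorted(hits, key=lambda s: (-hits[s], s))
    return [sentences[s] for s in ranked]

corpus = [
    "Nuclear energy reduces carbon emissions significantly",
    "The weather was pleasant yesterday",
    "Critics argue nuclear energy poses safety risks",
]
idx = build_index(corpus)
results = query(idx, corpus, ["nuclear", "energy"])
```

In the system described, candidates retrieved this way would then pass through trained classifiers, with the iterative annotation scheme supplying labels where top-ranked candidates are most often wrong.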
Generation of unpredictable time series by a Neural Network
A perceptron that learns the opposite of its own output is used to generate a
time series. We analyse properties of the weight vector and the generated
sequence, like the cycle length and the probability distribution of generated
sequences. A remarkable suppression of the autocorrelation function is
explained, and connections to the Bernasconi model are discussed. If a
continuous transfer function is used, the system displays chaotic and
intermittent behaviour, with the product of the learning rate and amplification
as a control parameter.
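The anti-learning dynamics described above can be sketched minimally: a perceptron reads a window of its own past outputs, emits a new bit, and then updates its weights away from that bit (learning the opposite of its output). Parameter names and the initial conditions here are illustrative choices, not the paper's:

```python
def antilearning_sequence(n_weights, length, eta=1.0):
    # perceptron output S = sign(w . x); after each step the weights are
    # pushed AWAY from the produced output (anti-Hebbian update)
    w = [1.0] + [0.0] * (n_weights - 1)  # arbitrary initial weight vector
    x = [1.0] * n_weights                # sliding window of past outputs
    seq = []
    for _ in range(length):
        s = 1.0 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1.0
        # move against the current output: w <- w - (eta/N) * s * x
        w = [wi - eta * s * xi / n_weights for wi, xi in zip(w, x)]
        x = [s] + x[:-1]                 # shift the new bit into the window
        seq.append(int(s))
    return seq

bits = antilearning_sequence(5, 200)  # a +/-1 sequence of length 200
```

Because the generator frustrates its own predictions, the resulting sequence has suppressed autocorrelations, which is the link to the Bernasconi model discussed in the abstract.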