nBIIG: A Neural BI Insights Generation System for Table Reporting
We present nBIIG, a neural Business Intelligence (BI) Insights Generation
system. Given a table, our system applies various analyses to create
corresponding RDF representations, and then uses a neural model to generate
fluent textual insights out of these representations. The generated insights
can be used by an analyst, via a human-in-the-loop paradigm, to enhance the
task of creating compelling table reports. The underlying generative neural
model is trained over large and carefully distilled data, curated from multiple
BI domains. Thus, the system can generate faithful and fluent insights over
open-domain tables, making it practical and useful.
Comment: Accepted to AAAI-2
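The pipeline described above (analyses over a table producing structured representations, which a generative model verbalizes) can be sketched roughly as follows. This is a minimal illustration, not the system's actual implementation: the neural generator is replaced by a template-based stand-in, and all function names, the fact-triple format, and the sample table are hypothetical.

```python
# Hypothetical sketch of an nBIIG-style pipeline: analyses over a table
# produce structured (subject, predicate, object) facts, and a generator
# turns each fact into a textual insight. The neural model is replaced
# here by templates; all names and data are illustrative.

def analyze_max(table, column):
    """Analysis: find the row holding the maximum of a numeric column."""
    row = max(table, key=lambda r: r[column])
    return (row["name"], f"highest_{column}", row[column])

def analyze_mean(table, column):
    """Analysis: compute the mean of a numeric column."""
    values = [r[column] for r in table]
    return ("all rows", f"mean_{column}", sum(values) / len(values))

def generate_insight(fact):
    """Stand-in for the neural generator: verbalize one structured fact."""
    subject, predicate, obj = fact
    metric = predicate.split("_", 1)[1]
    if predicate.startswith("highest"):
        return f"{subject} has the highest {metric} ({obj})."
    return f"The average {metric} across {subject} is {obj:.1f}."

table = [
    {"name": "North", "revenue": 120},
    {"name": "South", "revenue": 95},
    {"name": "East", "revenue": 140},
]
facts = [analyze_max(table, "revenue"), analyze_mean(table, "revenue")]
insights = [generate_insight(f) for f in facts]
```

In the actual system each generated insight would then be surfaced to an analyst for review under the human-in-the-loop paradigm, rather than emitted directly into a report.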
Active Learning for Natural Language Generation
The field of text generation suffers from a severe shortage of labeled data
due to the extremely expensive and time-consuming process involved in manual
annotation. A natural approach for coping with this problem is active learning
(AL), a well-known machine learning technique for improving annotation
efficiency by selectively choosing the most informative examples to label.
However, while AL has been well-researched in the context of text
classification, its application to text generation has remained largely unexplored.
In this paper, we present a first systematic study of active learning for text
generation, considering a diverse set of tasks and multiple leading AL
strategies. Our results indicate that existing AL strategies, despite their
success in classification, are largely ineffective for the text generation
scenario, and fail to consistently surpass the baseline of random example
selection. We highlight some notable differences between the classification and
generation scenarios, and analyze the selection behaviors of existing AL
strategies. Our findings motivate exploring novel approaches for applying AL to
NLG tasks.
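The pool-based AL setup the study compares against the random baseline can be sketched as below. This is an illustrative toy, not the paper's experimental code: the model's confidence is a stand-in function, whereas in the generation setting uncertainty would come from, e.g., the generator's output probabilities.

```python
import random

# Minimal pool-based active learning loop with two of the compared
# strategies: random selection (the baseline) and uncertainty sampling.
# The "model confidence" is a toy function; all names are illustrative.

def uncertainty(example, model_confidence):
    # Lower confidence => more informative under uncertainty sampling.
    return 1.0 - model_confidence(example)

def select_batch(pool, k, strategy, model_confidence, rng):
    """Choose k unlabeled examples from the pool for annotation."""
    if strategy == "random":
        return rng.sample(pool, k)
    # Uncertainty sampling: pick the k least-confident examples.
    return sorted(pool, key=lambda x: uncertainty(x, model_confidence),
                  reverse=True)[:k]

rng = random.Random(0)
pool = list(range(20))
conf = lambda x: x / 20.0   # toy model: higher index => more confident
batch = select_batch(pool, 3, "uncertainty", conf, rng)
```

The paper's finding is that, for generation tasks, selection rules of this informativeness-based kind fail to consistently beat the `"random"` branch above.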
Efficient Benchmarking (of Language Models)
The increasing versatility of language models (LMs) has given rise to a new
class of benchmarks that comprehensively assess a broad range of capabilities.
Such benchmarks are associated with massive computational costs, reaching
thousands of GPU hours per model. However, the efficiency aspect of these
evaluation efforts has raised little discussion in the literature. In this work
we present the problem of Efficient Benchmarking, namely intelligently reducing
the computation costs of LM evaluation without compromising reliability. Using
the HELM benchmark as a test case, we investigate how different benchmark design
choices affect the computation-reliability tradeoff. We propose to evaluate the
reliability of such decisions using a new measure, Decision Impact on
Reliability (DIoR for short). We find, for example, that the current leader on
HELM may change by merely removing a low-ranked model from the benchmark, and
observe that a handful of examples suffice to obtain the correct benchmark ranking.
Conversely, a slightly different choice of HELM scenarios varies the ranking widely.
Based on our findings, we outline a set of concrete recommendations for more
efficient benchmark design and utilization practices, leading to dramatic cost
savings with minimal loss of benchmark reliability, often reducing computation
by x100 or more.
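The kind of analysis behind a measure like DIoR can be illustrated as follows. This is a simplified stand-in, not the paper's definition of DIoR: it checks how one benchmark design decision (here, subsampling the evaluation examples) perturbs the model ranking, scoring ranking agreement with a pairwise Kendall-style concordance. All names and numbers are invented for the example.

```python
import itertools
import random

# Simplified illustration: how much does a benchmark design decision
# (evaluating on fewer examples) change the model ranking? Ranking
# agreement is the fraction of concordant model pairs.

def ranking(scores):
    """Model names ordered best-to-worst by mean per-example score."""
    means = {m: sum(v) / len(v) for m, v in scores.items()}
    return sorted(means, key=means.get, reverse=True)

def pairwise_agreement(rank_a, rank_b):
    """Fraction of model pairs ordered the same way in both rankings."""
    pos_a = {m: i for i, m in enumerate(rank_a)}
    pos_b = {m: i for i, m in enumerate(rank_b)}
    pairs = list(itertools.combinations(rank_a, 2))
    same = sum((pos_a[x] < pos_a[y]) == (pos_b[x] < pos_b[y])
               for x, y in pairs)
    return same / len(pairs)

rng = random.Random(0)
# Synthetic per-example scores for three models on a 10-example benchmark.
scores = {m: [rng.random() + bias for _ in range(10)]
          for m, bias in [("A", 0.2), ("B", 0.1), ("C", 0.0)]}
full_rank = ranking(scores)
# Design decision under test: evaluate on only the first 5 examples.
sub_rank = ranking({m: v[:5] for m, v in scores.items()})
agreement = pairwise_agreement(full_rank, sub_rank)
```

Repeating this over many perturbations of a design decision, and summarizing how often the decision flips the conclusions, is the general shape of the reliability analysis the abstract describes.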
Diversity Enhanced Table-to-Text Generation via Type Control
Generating natural language statements to convey information from tabular
data (i.e., Table-to-text) is a process with one input and a variety of valid
outputs. This characteristic underscores the ability to control the
generation and to produce a diverse set of outputs as two key assets. Thus, we
propose a diversity enhancing scheme that builds upon an inherent property of
the statements, namely, their logic-types, by using a type-controlled
Table-to-text generation model. Employing automatic and manual tests, we demonstrate
its twofold advantage: users can effectively tune the generated statement type,
and, by sampling different types, can obtain a diverse set of statements for a
given table.
Comment: 4 pages, 4 figure
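The type-controlled scheme described above can be sketched as follows. This is an illustrative stand-in, not the paper's model: the neural generator is replaced by templates, and the logic-type names, table, and function signatures are all hypothetical. The key idea it mirrors is that the desired logic-type is passed as a control signal, and sampling different types yields a diverse set of statements for the same table.

```python
# Hypothetical sketch of type-controlled Table-to-text generation:
# one input table, several logic-types, one statement per type.
# Templates stand in for the neural model; all names are illustrative.

TABLE = [{"team": "A", "wins": 9}, {"team": "B", "wins": 7}]

def generate(table, logic_type):
    """Stand-in generator: produce a statement of the requested logic-type."""
    if logic_type == "superlative":
        best = max(table, key=lambda r: r["wins"])
        return f"Team {best['team']} has the most wins ({best['wins']})."
    if logic_type == "count":
        return f"The table lists {len(table)} teams."
    if logic_type == "comparative":
        a, b = table[0], table[1]
        more, fewer = (a, b) if a["wins"] > b["wins"] else (b, a)
        return f"Team {more['team']} has more wins than team {fewer['team']}."
    raise ValueError(f"unknown logic-type: {logic_type}")

# Diversity via type control: request a different logic-type each time.
statements = [generate(TABLE, t)
              for t in ("superlative", "count", "comparative")]
```

Fixing the type lets a user tune what kind of statement is produced, while iterating over types gives the diverse statement set the abstract describes.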