4,260 research outputs found
Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey
The emergence of natural language processing has revolutionized the way users
interact with tabular data, enabling a shift from traditional query languages
and manual plotting to more intuitive, language-based interfaces. The rise of
large language models (LLMs) such as ChatGPT and its successors has further
advanced this field, opening new avenues for natural language processing
techniques. This survey presents a comprehensive overview of natural language
interfaces for tabular data querying and visualization, which allow users to
interact with data using natural language queries. We introduce the fundamental
concepts and techniques underlying these interfaces with a particular emphasis
on semantic parsing, the key technology facilitating the translation from
natural language to SQL queries or data visualization commands. We then delve
into the recent advancements in Text-to-SQL and Text-to-Vis problems from the
perspectives of datasets, methodologies, metrics, and system designs. This
includes a deep dive into the influence of LLMs, highlighting their strengths,
limitations, and potential for future improvements. Through this survey, we aim
to provide a roadmap for researchers and practitioners interested in developing
and applying natural language interfaces for data interaction in the era of
large language models.Comment: 20 pages, 4 figures, 5 tables. Submitted to IEEE TKD
Efficient Tabular LR Parsing
We give a new treatment of tabular LR parsing, which is an alternative to
Tomita's generalized LR algorithm. The advantage is twofold. Firstly, our
treatment is conceptually more attractive because it uses simpler concepts,
such as grammar transformations and standard tabulation techniques also know as
chart parsing. Secondly, the static and dynamic complexity of parsing, both in
space and time, is significantly reduced.Comment: 8 pages, uses aclap.st
Tabular Parsing
This is a tutorial on tabular parsing, on the basis of tabulation of
nondeterministic push-down automata. Discussed are Earley's algorithm, the
Cocke-Kasami-Younger algorithm, tabular LR parsing, the construction of parse
trees, and further issues.Comment: 21 pages, 14 figure
Interleaving natural language parsing and generation through uniform processing
We present a new model of natural language processing in which natural language parsing and generation are strongly interleaved tasks. Interleaving of parsing and generation is important if we assume that natural language understanding and production are not only performed in isolation but also can work together to obtain subsentential interactions in text revision or dialog systems. The core of the model is a new uniform agenda-driven tabular algorithm, called UTA. Although uniformly defined, UTA is able to configure itself dynamically for either parsing or generation, because it is fully driven by the structure of the actual input - a string for parsing and a semantic expression for generation. Efficient interleaving of parsing and generation is obtained through item sharing between parsing and generation. This novel processing strategy facilitates exchanging items (i.e., partial results) computed in one direction automatically to the other direction as well. The advantage of UTA in combination with the item sharing method is that we are able to extend the use of memorization techniques even to the case of an interleaved approach. In order to demonstrate UTA\u27s utility for developing high-level performance methods, we present a new algorithm for incremental self-monitoring during natural language production
Interleaving natural language parsing and generation through uniform processing
We present a new model of natural language processing in which natural language parsing and generation are strongly interleaved tasks. Interleaving of parsing and generation is important if we assume that natural language understanding and production are not only performed in isolation but also can work together to obtain subsentential interactions in text revision or dialog systems. The core of the model is a new uniform agenda-driven tabular algorithm, called UTA. Although uniformly defined, UTA is able to configure itself dynamically for either parsing or generation, because it is fully driven by the structure of the actual input - a string for parsing and a semantic expression for generation. Efficient interleaving of parsing and generation is obtained through item sharing between parsing and generation. This novel processing strategy facilitates exchanging items (i.e., partial results) computed in one direction automatically to the other direction as well. The advantage of UTA in combination with the item sharing method is that we are able to extend the use of memorization techniques even to the case of an interleaved approach. In order to demonstrate UTA's utility for developing high-level performance methods, we present a new algorithm for incremental self-monitoring during natural language production
A Variant of Earley Parsing
The Earley algorithm is a widely used parsing method in natural language
processing applications. We introduce a variant of Earley parsing that is based
on a ``delayed'' recognition of constituents. This allows us to start the
recognition of a constituent only in cases in which all of its subconstituents
have been found within the input string. This is particularly advantageous in
several cases in which partial analysis of a constituent cannot be completed
and in general in all cases of productions sharing some suffix of their
right-hand sides (even for different left-hand side nonterminals). Although the
two algorithms result in the same asymptotic time and space complexity, from a
practical perspective our algorithm improves the time and space requirements of
the original method, as shown by reported experimental results.Comment: 12 pages, 1 Postscript figure, uses psfig.tex and llncs.st
A Web-Based Tool for Analysing Normative Documents in English
Our goal is to use formal methods to analyse normative documents written in
English, such as privacy policies and service-level agreements. This requires
the combination of a number of different elements, including information
extraction from natural language, formal languages for model representation,
and an interface for property specification and verification. We have worked on
a collection of components for this task: a natural language extraction tool, a
suitable formalism for representing such documents, an interface for building
models in this formalism, and methods for answering queries asked of a given
model. In this work, each of these concerns is brought together in a web-based
tool, providing a single interface for analysing normative texts in English.
Through the use of a running example, we describe each component and
demonstrate the workflow established by our tool
- …