17,646 research outputs found
A Flexible Shallow Approach to Text Generation
In order to support the efficient development of NL generation systems, two
orthogonal methods are currently pursued with emphasis: (1) reusable, general,
and linguistically motivated surface realization components, and (2) simple,
task-oriented template-based techniques. In this paper we argue that, from an
application-oriented perspective, the benefits of both are still limited. In
order to improve this situation, we suggest and evaluate shallow generation
methods associated with increased flexibility. We advise a close connection
between domain-motivated and linguistic ontologies that supports the quick
adaptation to new tasks and domains, rather than the reuse of general
resources. Our method is especially designed for generating reports with
limited linguistic variations.Comment: LaTeX, 10 page
Recommended from our members
The effect of multiple knowledge sources on learning and teaching
Current paradigms for machine-based learning and teaching tend to perform their task in isolation from a rich context of existing knowledge. In contrast, the research project presented here takes the view that bringing multiple sources of knowledge to bear is of central importance to learning in complex domains. As a consequence teaching must both take advantage of and beware of interactions between new and existing knowledge. The central process which connects learning to its context is reasoning by analogy, a primary concern of this research. In teaching, the connection is provided by the explicit use of a learning model to reason about the choice of teaching actions. In this learning paradigm, new concepts are incrementally refined and integrated into a body of expertise, rather than being evaluated against a static notion of correctness. The domain chosen for this experimentation is that of learning to solve "algebra story problems." A model of acquiring problem solving skills in this domain is described, including: representational structures for background knowledge, a problem solving architecture, learning mechanisms, and the role of analogies in applying existing problem solving abilities to novel problems. Examples of learning are given for representative instances of algebra story problems. After relating our views to the psychological literature, we outline the design of a teaching system. Finally, we insist on the interdependence of learning and teaching and on the synergistic effects of conducting both research efforts in parallel
Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases
A critical challenge in constructing a natural language interface to database
(NLIDB) is bridging the semantic gap between a natural language query (NLQ) and
the underlying data. Two specific ways this challenge exhibits itself is
through keyword mapping and join path inference. Keyword mapping is the task of
mapping individual keywords in the original NLQ to database elements (such as
relations, attributes or values). It is challenging due to the ambiguity in
mapping the user's mental model and diction to the schema definition and
contents of the underlying database. Join path inference is the process of
selecting the relations and join conditions in the FROM clause of the final SQL
query, and is difficult because NLIDB users lack the knowledge of the database
schema or SQL and therefore cannot explicitly specify the intermediate tables
and joins needed to construct a final SQL query. In this paper, we propose
leveraging information from the SQL query log of a database to enhance the
performance of existing NLIDBs with respect to these challenges. We present a
system Templar that can be used to augment existing NLIDBs. Our extensive
experimental evaluation demonstrates the effectiveness of our approach, leading
up to 138% improvement in top-1 accuracy in existing NLIDBs by leveraging SQL
query log information.Comment: Accepted to IEEE International Conference on Data Engineering (ICDE)
201
An infrastructure for building semantic web portals
In this paper, we present our KMi semantic web portal infrastructure, which supports two important tasks of semantic web portals, namely metadata extraction and data querying. Central to our infrastructure are three components: i) an automated metadata extraction tool, ASDI, which supports the extraction of high quality metadata from heterogeneous sources, ii) an ontology-driven question answering tool, AquaLog, which makes use of the domain specific ontology and the semantic metadata extracted by ASDI to answers questions in natural language format, and iii) a semantic search engine, which enhances traditional
text-based searching by making use of the underlying ontologies and the extracted metadata. A semantic web portal application has been built, which illustrates the usage of this infrastructure
Knowledge Rich Natural Language Queries over Structured Biological Databases
Increasingly, keyword, natural language and NoSQL queries are being used for
information retrieval from traditional as well as non-traditional databases
such as web, document, image, GIS, legal, and health databases. While their
popularity are undeniable for obvious reasons, their engineering is far from
simple. In most part, semantics and intent preserving mapping of a well
understood natural language query expressed over a structured database schema
to a structured query language is still a difficult task, and research to tame
the complexity is intense. In this paper, we propose a multi-level
knowledge-based middleware to facilitate such mappings that separate the
conceptual level from the physical level. We augment these multi-level
abstractions with a concept reasoner and a query strategy engine to dynamically
link arbitrary natural language querying to well defined structured queries. We
demonstrate the feasibility of our approach by presenting a Datalog based
prototype system, called BioSmart, that can compute responses to arbitrary
natural language queries over arbitrary databases once a syntactic
classification of the natural language query is made
GATE -- an Environment to Support Research and Development in Natural Language Engineering
We describe a software environment to support research and development in natural language (NL) engineering. This environment -- GATE (General Architecture for Text Engineering) -- aims to advance research in the area of machine processing of natural languages by providing a software infrastructure on top of which heterogeneous NL component modules may be evaluated and refined individually or may be combined into larger application systems. Thus, GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE will promote reuse of component technology, permit specialisation and collaboration in large-scale projects, and allow for the comparison and evaluation of alternative technologies. The first release of GATE is now available
- …