48,365 research outputs found
Generating multimedia presentations: from plain text to screenplay
In many Natural Language Generation (NLG) applications, the output is limited to plain text – i.e., a string of words with punctuation and paragraph breaks, but no indications for layout, or pictures, or dialogue. In several projects, we have begun to explore NLG applications in which these extra media are brought into play. This paper gives an informal account of what we have learned. For coherence, we focus on the domain of patient information leaflets, and follow an example in which the same content is expressed first in plain text, then in formatted text, then in text with pictures, and finally in a dialogue script that can be performed by two animated agents. We show how the same meaning can be mapped to realisation patterns in different media, and how the expanded options for expressing meaning are related to the perceived style and tone of the presentation. Throughout, we stress that the extra media are not simple added to plain text, but integrated with it: thus the use of formatting, or pictures, or dialogue, may require radical rewording of the text itself
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 tabl
Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts
Word embedding is a Natural Language Processing (NLP) technique that
automatically maps words from a vocabulary to vectors of real numbers in an
embedding space. It has been widely used in recent years to boost the
performance of a vari-ety of NLP tasks such as Named Entity Recognition,
Syntac-tic Parsing and Sentiment Analysis. Classic word embedding methods such
as Word2Vec and GloVe work well when they are given a large text corpus. When
the input texts are sparse as in many specialized domains (e.g.,
cybersecurity), these methods often fail to produce high-quality vectors. In
this pa-per, we describe a novel method to train domain-specificword embeddings
from sparse texts. In addition to domain texts, our method also leverages
diverse types of domain knowledge such as domain vocabulary and semantic
relations. Specifi-cally, we first propose a general framework to encode
diverse types of domain knowledge as text annotations. Then we de-velop a novel
Word Annotation Embedding (WAE) algorithm to incorporate diverse types of text
annotations in word em-bedding. We have evaluated our method on two
cybersecurity text corpora: a malware description corpus and a Common
Vulnerability and Exposure (CVE) corpus. Our evaluation re-sults have
demonstrated the effectiveness of our method in learning domain-specific word
embeddings
Neurocognitive Informatics Manifesto.
Informatics studies all aspects of the structure of natural and artificial information systems. Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. In this position paper examples of neurocognitive inspirations and promising directions in this area are given
- …