5,373 research outputs found
WEST: A Web Browser for Small Terminals
We describe WEST, a WEb browser for Small Terminals, that aims to solve some of the problems associated with accessing web pages on hand-held devices. Through a novel combination of text reduction and focus+context visualization, users can access web pages from a very limited display environment, since the system will provide an overview of the contents of a web page even when it is too large to be displayed in its entirety. To make maximum use of the limited resources available on a typical hand-held terminal, much of the most demanding work is done by a proxy server, allowing the terminal to concentrate on the task of providing responsive user interaction. The system makes use of some interaction concepts reminiscent of those defined in the Wireless Application Protocol (WAP), making it possible to utilize the techniques described here for WAP-compliant devices and services that may become available in the near future
Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps
Concept maps can be used to concisely represent important information and
bring structure into large document collections. Therefore, we study a variant
of multi-document summarization that produces summaries in the form of concept
maps. However, suitable evaluation datasets for this task are currently
missing. To close this gap, we present a newly created corpus of concept maps
that summarize heterogeneous collections of web documents on educational
topics. It was created using a novel crowdsourcing approach that allows us to
efficiently determine important elements in large document collections. We
release the corpus along with a baseline system and proposed evaluation
protocol to enable further research on this variant of summarization.Comment: Published at EMNLP 201
Domain transfer for deep natural language generation from abstract meaning representations
Stochastic natural language generation systems that are trained from labelled datasets are often domainspecific in their annotation and in their mapping from semantic input representations to lexical-syntactic outputs. As a result, learnt models fail to generalize across domains, heavily restricting their usability beyond single applications. In this article, we focus on the problem of domain adaptation for natural language generation. We show how linguistic knowledge from a source domain, for which labelled data is available, can be adapted to a target domain by reusing training data across domains. As a key to this, we propose to employ abstract meaning representations as a common semantic representation across domains. We model natural language generation as a long short-term memory recurrent neural network encoderdecoder, in which one recurrent neural network learns a latent representation of a semantic input, and a second recurrent neural network learns to decode it to a sequence of words. We show that the learnt representations can be transferred across domains and can be leveraged effectively to improve training on new unseen domains. Experiments in three different domains and with six datasets demonstrate that the lexical-syntactic constructions learnt in one domain can be transferred to new domains and achieve up to 75-100% of the performance of in-domain training. This is based on objective metrics such as BLEU and semantic error rate and a subjective human rating study. Training a policy from prior knowledge from a different domain is consistently better than pure in-domain training by up to 10%
Sciunits: Reusable Research Objects
Science is conducted collaboratively, often requiring knowledge sharing about
computational experiments. When experiments include only datasets, they can be
shared using Uniform Resource Identifiers (URIs) or Digital Object Identifiers
(DOIs). An experiment, however, seldom includes only datasets, but more often
includes software, its past execution, provenance, and associated
documentation. The Research Object has recently emerged as a comprehensive and
systematic method for aggregation and identification of diverse elements of
computational experiments. While a necessary method, mere aggregation is not
sufficient for the sharing of computational experiments. Other users must be
able to easily recompute on these shared research objects. In this paper, we
present the sciunit, a reusable research object in which aggregated content is
recomputable. We describe a Git-like client that efficiently creates, stores,
and repeats sciunits. We show through analysis that sciunits repeat
computational experiments with minimal storage and processing overhead.
Finally, we provide an overview of sharing and reproducible cyberinfrastructure
based on sciunits gaining adoption in the domain of geosciences
- …