Domain transfer for deep natural language generation from abstract meaning representations
Stochastic natural language generation systems that are trained from labelled datasets are often domain-specific in their annotation and in their mapping from semantic input representations to lexical-syntactic outputs. As a result, learnt models fail to generalize across domains, heavily restricting their usability beyond single applications. In this article, we focus on the problem of domain adaptation for natural language generation. We show how linguistic knowledge from a source domain, for which labelled data is available, can be adapted to a target domain by reusing training data across domains. As a key to this, we propose to employ abstract meaning representations as a common semantic representation across domains. We model natural language generation as a long short-term memory recurrent neural network encoder-decoder, in which one recurrent neural network learns a latent representation of a semantic input, and a second recurrent neural network learns to decode it to a sequence of words. We show that the learnt representations can be transferred across domains and can be leveraged effectively to improve training on new unseen domains. Experiments in three different domains and with six datasets demonstrate that the lexical-syntactic constructions learnt in one domain can be transferred to new domains and achieve up to 75-100% of the performance of in-domain training. This is based on objective metrics such as BLEU and semantic error rate and a subjective human rating study. Training a policy from prior knowledge from a different domain is consistently better than pure in-domain training by up to 10%.
An Ontology-Based Method for Semantic Integration of Business Components
Building new business information systems from reusable components is now a
widely adopted approach. Applying it in the analysis and design phases is of
great interest and requires a particular class of components called Business
Components (BC). Business Components are today developed by several
manufacturers and are available in many repositories. However, reusing and
integrating them in a new Information System requires detection and resolution
of semantic conflicts. Moreover, most integration and semantic conflict
resolution systems rely on ontology alignment methods based on a domain
ontology. This work is positioned at the intersection of two research areas:
integration of reusable Business Components and alignment of ontologies for
semantic conflict resolution. Our contribution concerns both the proposal of a
BC integration solution based on ontology alignment and a method for enriching
the domain ontology used as a support for alignment.
Comment: IEEE New Technologies of Distributed Systems (NOTERE), 2011 11th
Annual International Conference; ISSN: 2162-1896; Print ISBN:
978-1-4577-0729-2; INSPEC Accession Number: 12122775 201
Evorus: A Crowd-powered Conversational Assistant Built to Automate Itself Over Time
Crowd-powered conversational assistants have been shown to be more robust
than automated systems, but at the cost of higher response latency and
monetary cost. A promising direction is to combine the two approaches for
high-quality, low-latency, and low-cost solutions. In this paper, we introduce
Evorus, a crowd-powered conversational assistant built to automate itself over
time by (i) allowing new chatbots to be easily integrated to automate more
scenarios, (ii) reusing prior crowd answers, and (iii) learning to
automatically approve response candidates. Our 5-month-long deployment with 80
participants and 281 conversations shows that Evorus can automate itself
without compromising conversation quality. Crowd-AI architectures have long
been proposed as a way to reduce cost and latency for crowd-powered systems;
Evorus demonstrates how automation can be introduced successfully in a deployed
system. Its architecture allows future researchers to innovate further on the
underlying automated components in the context of a deployed open-domain
dialog system.
Comment: 10 pages. To appear in the Proceedings of the Conference on Human
Factors in Computing Systems 2018 (CHI'18)
MultiVeStA: Statistical Model Checking for Discrete Event Simulators
The modeling, analysis and performance evaluation of large-scale systems are difficult tasks. Due to the size and complexity of the considered systems, an approach typically followed by engineers consists in performing simulations of system models to obtain statistical estimations of quantitative properties. Similarly, a technique used by computer scientists working on quantitative analysis is Statistical Model Checking (SMC), where rigorous mathematical languages (typically logics) are used to express system properties of interest. Such properties can then be automatically estimated by tools performing simulations of the model at hand. These property specification languages, often not popular among engineers, provide a formal, compact and elegant way to express system properties without needing to hard-code them in the model definition. This paper presents MultiVeStA, a statistical analysis tool which can be easily integrated with existing discrete event simulators, enriching them with efficient distributed statistical analysis and SMC capabilities.
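The idea of estimating a property by repeated simulation can be illustrated with a minimal Python sketch. It is not MultiVeStA's interface or algorithm: the toy model `simulate_once`, its parameters, and the use of the Chernoff-Hoeffding bound to choose the number of runs are all assumptions made for illustration.

```python
import math
import random


def required_samples(epsilon, delta):
    """Runs needed so the estimate is within +/- epsilon of the true
    probability with confidence 1 - delta (Chernoff-Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def simulate_once(rng, steps=10, p_fail=0.05):
    """Hypothetical toy model: a component survives `steps` independent
    steps, each failing with probability `p_fail`. Returns True if the
    property "no failure occurred" holds on this simulated run."""
    return all(rng.random() >= p_fail for _ in range(steps))


def estimate_probability(epsilon=0.02, delta=0.05, seed=0):
    """Estimate the probability that the property holds by averaging
    the outcomes of independent simulation runs."""
    rng = random.Random(seed)
    n = required_samples(epsilon, delta)
    successes = sum(simulate_once(rng) for _ in range(n))
    return successes / n, n


if __name__ == "__main__":
    estimate, runs = estimate_probability()
    print(f"estimated probability {estimate:.3f} from {runs} runs")
```

For this toy model the exact answer is 0.95 to the power 10, so the estimate can be checked by hand. In a real setting the property would be written in a specification language rather than hard-coded in the simulation function.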
Exploiting multi-word units in history-based probabilistic generation
We present a simple history-based model for sentence generation from LFG f-structures, which improves on the accuracy of previous models by breaking down PCFG independence assumptions so that more f-structure conditioning context is used in the prediction of grammar rule expansions. In addition, we present work on experiments with named entities and other multi-word units, showing a statistically significant improvement of generation accuracy. Tested on section 23 of the Penn Wall Street Journal Treebank, the techniques described in this paper improve BLEU scores from 66.52 to 68.82, and coverage from 98.18% to 99.96%.
ICOPER Project - Deliverable 4.3 ISURE: Recommendations for extending effective reuse, embodied in the ICOPER CD&R
The purpose of this document is to capture the ideas and recommendations, within and beyond the ICOPER community, concerning the reuse of learning content, including appropriate methodologies as well as established strategies for remixing and repurposing reusable resources. The overall remit of this work focuses on describing the key issues that are related to extending effective reuse embodied in such materials. The objective of this investigation is to support the reuse of learning content whilst considering how it could be originally created and then adapted with that ‘reuse’ in mind. In these circumstances a survey on effective reuse best practices can often provide an insight into the main challenges and benefits involved in the process of creating, remixing and repurposing what we are now designating as Reusable Learning Content (RLC).
Several key issues are analysed in this report: Recommendations for extending effective reuse, building upon those described in the previous related deliverables 4.1 Content Development Methodologies and 4.2 Quality Control and Web 2.0 technologies. The findings of this current survey, however, provide further recommendations and strategies for using and developing this reusable learning content. In the spirit of ‘reuse’, this work also aims to serve as a foundation for the many different stakeholders and users within, and beyond, the ICOPER community who are interested in reusing learning resources.
This report analyses a variety of information. Evidence has been gathered from a qualitative survey that has focused on the technical and pedagogical recommendations suggested by a Special Interest Group (SIG) on the most innovative practices with respect to new media content authors (for content authoring or modification) and course designers (for unit creation). This extended community includes a wider collection of OER specialists. This collected evidence, in the form of video and audio interviews, has also been represented as multimedia assets potentially helpful for learning and useful as learning content in the New Media Space (See section 4 for further details).
Section 2 of this report introduces the concept of reusable learning content and reusability. Section 3 discusses an application created by the ICOPER community to enhance the opportunities for developing reusable content. Section 4 of this report provides an overview of the methodology used for the qualitative survey. Section 5 presents a summary of thematic findings. Section 6 highlights a list of recommendations for effective reuse of educational content, which were derived from the thematic analysis described in Appendix A. Finally, section 7 summarises the key outcomes of this work.
Requirements for software engineering languages
This paper analyzes the concepts of software construction embodied in the Draco system. The analysis relates specific mechanisms in Draco to particular software engineering (SE) principles and suggests future research needed to extend the approach. The purpose of the analysis is to help researchers understand Draco better and thus be able to steer future research on this type of software engineering tool in productive directions.