9,807 research outputs found
The Validation of Speech Corpora
1.2 Intended audience........................
Recommended from our members
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010)
Structural Learning of Attack Vectors for Generating Mutated XSS Attacks
Web applications suffer from cross-site scripting (XSS) attacks that
resulting from incomplete or incorrect input sanitization. Learning the
structure of attack vectors could enrich the variety of manifestations in
generated XSS attacks. In this study, we focus on generating more threatening
XSS attacks for the state-of-the-art detection approaches that can find
potential XSS vulnerabilities in Web applications, and propose a mechanism for
structural learning of attack vectors with the aim of generating mutated XSS
attacks in a fully automatic way. Mutated XSS attack generation depends on the
analysis of attack vectors and the structural learning mechanism. For the
kernel of the learning mechanism, we use a Hidden Markov model (HMM) as the
structure of the attack vector model to capture the implicit manner of the
attack vector, and this manner is benefited from the syntax meanings that are
labeled by the proposed tokenizing mechanism. Bayes theorem is used to
determine the number of hidden states in the model for generalizing the
structure model. The paper has the contributions as following: (1)
automatically learn the structure of attack vectors from practical data
analysis to modeling a structure model of attack vectors, (2) mimic the manners
and the elements of attack vectors to extend the ability of testing tool for
identifying XSS vulnerabilities, (3) be helpful to verify the flaws of
blacklist sanitization procedures of Web applications. We evaluated the
proposed mechanism by Burp Intruder with a dataset collected from public XSS
archives. The results show that mutated XSS attack generation can identify
potential vulnerabilities.Comment: In Proceedings TAV-WEB 2010, arXiv:1009.330
Why We Need New Evaluation Metrics for NLG
The majority of NLG evaluation relies on automatic metrics, such as BLEU . In
this paper, we motivate the need for novel, system- and data-independent
automatic evaluation methods: We investigate a wide range of metrics, including
state-of-the-art word-based and novel grammar-based ones, and demonstrate that
they only weakly reflect human judgements of system outputs as generated by
data-driven, end-to-end NLG. We also show that metric performance is data- and
system-specific. Nevertheless, our results also suggest that automatic metrics
perform reliably at system-level and can support system development by finding
cases where a system performs poorly.Comment: accepted to EMNLP 201
Web-based environment for user generation of spoken dialog for virtual assistants
In this paper, a web-based spoken dialog generation environment which enables users to edit dialogs with a video virtual assistant is developed and to also select the 3D motions and tone of voice for the assistant. In our proposed system, “anyone” can “easily” post/edit contents of the dialog for the dialog system. The dialog type corresponding to the system is limited to the question-and-answer type dialog, in order to avoid editing conflicts caused by editing by multiple users. The spoken dialog sharing service and FST generator generates spoken dialog content for the MMDAgent spoken dialog system toolkit, which includes a speech recognizer, a dialog control unit, a speech synthesizer, and a virtual agent. For dialog content creation, question-and-answer dialogs posted by users and FST templates are used. The proposed system was operated for more than a year in a student lounge at the Nagoya Institute of Technology, where users added more than 500 dialogs during the experiment. Images were also registered to 65% of the postings. The most posted category is related to “animation, video games, manga.”
The system was subjected to open examination by tourist information staff who had no prior experience with spoken dialog systems. Based on their impressions of tourist use of the dialog system, they shortened the length of some of the system’s responses and added pauses to the longer responses to make them easier to understand
- …